HUMAN HOMOLOGUE OF UNC-53 PROTEIN OF C. ELEGANS



The present invention relates to a vertebrate
homologue of UNC-53 protein of C. elegans and cDNA
sequences coding for said homologues or functional
equivalents thereof. The invention also relates to
processes for identifying compounds which control cell
behaviour, compounds identified and pharmaceutical
compositions containing them in addition to processes
and assays for identifying disease states in which
said gene or protein is dysfunctional.

The control of cell motility, cell shape and
directionality of cell outgrowth of axons or other
cell outgrowths is an essential feature in the
morphogenesis and function of both unicellular and
multicellular organisms.

Some cell surface proteins and extra-cellular
molecules controlling the directionality and potential
of cell migration have been identified, although the
processes involved are not generally understood. It
is generally considered that a long-range migration of
a cell process (also known as a growth cone extension)
is a stepwise event, whereby prior to and after each
extension there is the formation of a structure at the
leading edge of the cell. Localised stabilisation of
the actin cytoskeleton and association with plus end
regions of microtubules is a general cell biological
process underlying the choice of directional
extension.

The present inventors have surprisingly found a
new human gene/protein belonging to the UNC-53 family
that binds microtubules and, in particular, the
plus-end regions of microtubules.

A gene from the free-living nematode
Caenorhabditis elegans designated "unc-53" has been
previously identified and cloned (Abstract,
International C. elegans Meeting, June 1-5 1991,
Madison, Wisconsin, 58, Bogaert and Goh). The present
inventors previously identified UNC-53 protein as a
signal transducer or signal integrator controlling the
directionality of cell migration and/or cell shape in
C. elegans (WO 96/38555).

The C. elegans UNC-53 protein (Ceunc53) and
previously found human homologues thereof (hs-unc53/1
and hs-unc53/2) were found to act as a signal
transducer or a signal integrator, controlling the
directionality of cell migration, cell shape and
growth extension. Evidence indicates that the
presently found homologue, designated hs-unc53/3,
might act as an adapter linking extracellular signals
to the actin cytoskeleton. Firstly, hs-unc53/3 shows
homology to the cortical actin binding proteins, and
the Ce-UNC-53 protein has been shown to bind F-actin
in vitro and to cause actin re-organization in vivo
when expressed in mammalian cells, leading to an
increased number of filopodia and lamellipodia.
Furthermore, increased neurite extension and increased
cell motility could be observed. Hs-UNC-53-3 may play
an important role in the development of various
diseases.

According to a first aspect of the present
invention there is provided a vertebrate protein
homologue of an UNC-53 protein of C. elegans, which
protein comprises an amino acid sequence having one or
more of sequence blocks A, B, C, D, E, F, G or H as
illustrated in figure 4 or which differs from said
blocks in conservative amino acid changes.

According to a further aspect of the present
invention, there is provided a vertebrate protein
homologue of UNC-53 protein of C. elegans or a
functional equivalent, derivative or bioprecursor
thereof, having an amino acid sequence encoded by the
nucleotide sequence illustrated in figure 1(e).

For the purposes of the present invention a
"derivative" should be taken to mean mutational
derivatives, fusions, internal deletions, splice
variants and muteins.

Preferably, said vertebrate homologue is a human
protein, and preferably a mammalian or a mouse
protein.

A further aspect of the invention comprises a
vertebrate homologue comprising an amino acid sequence
as shown in figure 1(f) or the variants thereof or an
amino acid sequence which differs from the amino acid
sequences shown in figure 1(f) to a significant extent
only in one or more conservative amino acid changes.

In a further aspect of the present invention
there is also provided a nucleic acid molecule, which
is preferably DNA, and which encodes a vertebrate
homologue of UNC-53 protein of C. elegans, or a
functional equivalent, derivative, fragment or
bioprecursor of said homologue according to the
invention. Preferably, the cDNA comprises a sequence
of nucleotides encoding an amino acid sequence as
illustrated in figure 1(f) or the variants thereof or
an amino acid sequence which differs from the sequences
shown in these figures to a significant extent only in
one or more conservative amino acid changes. Preferably
the DNA is cDNA, which cDNA comprises the sequence
shown in figure 1(e) or the variants indicated therein.
Also provided by the present invention is a nucleic
acid sequence capable of hybridising to the nucleic
acid or DNA sequences according to the invention under
high stringency conditions, which conditions are well
known to those skilled in the art.

The cDNA according to the invention may be
included in an expression vector which may itself be
used to transform or transfect a host cell, which cell
may be bacterial or eukaryotic in origin including,
for example, an animal or plant cell, a fungal
cell or an insect cell. Thus, advantageously, once
the cDNA corresponding to the genome of the vertebrate
homologue of UNC-53 of C. elegans according to the
invention is synthesised, using for example reverse
transcriptase or the like, a range of cells, tissues
or organisms may be transfected following
incorporation of the selected cDNA clone into an
appropriate expression vector. The expression vector
according to the invention may comprise a promoter of
C. elegans or one of human, mouse or viral origin and
optionally a sequence encoding a reporter molecule,
such as, for example, green fluorescent protein.

The present invention, therefore, also further
comprises a transgenic cell, tissue or organism
comprising a transgene capable of expressing a
vertebrate homologue of UNC-53 protein of C. elegans
according to the invention. The term "transgene
capable of expressing a vertebrate homologue of UNC-53
protein of C. elegans" as used herein means a suitable
nucleic acid sequence which leads to the expression of
a vertebrate homologue of UNC-53 protein of C. elegans
according to the invention having the same function
and/or activity. The transgene may include, for
example, genomic nucleic acid isolated from the
appropriate vertebrate or synthetic nucleic acid
including cDNA. The term "transgenic organisms,
tissues or cells" as used herein means any suitable
organism and/or part of an organism, tissue or cell,
that contains exogenous nucleic acid either stably
integrated in the genome or in an extrachromosomal
state.

Preferably the transgenic cell comprises any of
a COS cell, a HepG2 cell, an MCF-7 cell, an N4
neuroblastoma cell, an NIH3T3 cell, a colorectal or
carcinoma cell or a human derived cell such as a
fibroblast or the like. The transgenic organism may
be an insect, a non-human animal or a plant and
preferably C. elegans or a related nematode.
Preferably, the transgene comprises the nucleic acid
or cDNA sequence encoding the vertebrate homologue
according to the invention as described above. The
transgene preferably comprises an expression vector
according to the invention.

The term "functional fragment" as used herein
should be taken to mean a fragment of the gene coding
for the vertebrate homologue of the UNC-53 protein of
C. elegans according to the invention. For example,
the gene may comprise deletions or mutations but may
still encode a functional vertebrate homologue of
UNC-53 protein.

Further provided by the present invention is a
method of producing a mutant vertebrate non-human
organism having a mutation in the wild-type gene
coding for the vertebrate homologue of UNC-53 protein
according to the invention, which mutation affects
cell behaviour or the regulation of cell motility or
the shape or the direction of cell migration or
microtubule plus end stability or function and
localisation of protein complexes located thereon,
which method comprises inducing a mutation in the
vertebrate homologue of UNC-53 protein in said
organism. These mutant organisms may be used in a
screen to identify the effects of compounds on these
cell functions.

The vertebrate homologue of UNC-53 protein of
C. elegans or the cDNA or genomic DNA encoding it or a
functional equivalent, derivative, fragment or
bioprecursor of said homologue, may advantageously be
used as a medicament, or in the preparation of a
medicament to treat or prevent disorders associated
with inhibition or overexpression of the vertebrate
homologue of UNC-53 according to the invention. Such
disorders may be alleviated by promoting neuronal
regeneration, revascularisation or wound healing or
the treatment of chronic neurodegenerative disorders,
psychiatric disorders or acute traumatic injuries or
fibrotic disease or disease in which physiological
events requiring the polarity of cells or epithelia
are abnormally functioning. Accordingly, the
vertebrate homologue according to the invention,
dominant positive or negative mutants thereof, or
inhibitors thereof may advantageously be used to
induce or alleviate contact inhibition in a cell or in
preventing carcinoma development. Typically, the
above medical conditions may be treated in mammals and
more preferably humans by either the homologue of
UNC-53 protein or alternatively by a nucleic acid coding
for the protein or the protein itself according to the
invention. Alternatively an antisense oligonucleotide
to said UNC-53 vertebrate homologue may be used to
prevent its expression. Examples of other nucleic
acid sequences which may be used include 3'
untranslated regions of mRNA which could be used to
prevent transcription of the genomic sequence encoding
the vertebrate homologue of UNC-53 protein
according to the invention.

The vertebrate homologue of UNC-53 protein
according to the invention may be incorporated into a
pharmaceutically acceptable composition together with
a suitable carrier, diluent or excipient therefor.
The pharmaceutical composition may advantageously
comprise, additionally or alternatively, the nucleic
acid sequence according to the invention as defined
above.

The induction or inhibition of the expression of
hu-UNC-53/3 by pharmacological means may
advantageously be used to induce neuronal
regeneration, revascularisation or wound healing or be
involved in the treatment of chronic
neurodegenerative disorders, or acute traumatic
injuries or fibrotic diseases, or physiological events
requiring the polarity of cells, or oncology and
metastasis of cells, or apoptotic pathways.

The present invention therefore also provides for
a method of determining whether a compound is an
inhibitor or enhancer of the regulation of cell
behaviour, growth, transformation, cell shape or
motility or the direction of cell migration,
microtubule plus end stability or function and
localisation of protein complexes thereon, which
method comprises contacting said compound with a
transgenic cell according to the invention and
screening for a phenotypic change in said cell. The
method can therefore be used to determine whether the
compound comprises an inhibitor or an enhancer of the
signal transduction pathway of said transgenic cell of
which pathway said vertebrate homologue of UNC-53
protein according to the invention is a component, or
whether said compound is an inhibitor or an enhancer
of a parallel or redundant signal transduction pathway
in said cell. The present invention also provides a
method to determine that the protein in said signal
transduction pathway is a vertebrate homologue of
UNC-53 protein of C. elegans according to the invention.

Preferably, the phenotypic change to be screened
comprises a change in cell shape or a change in cell
motility. Where a transgenic cell is used in
accordance with one embodiment of the method of the
invention, an N4 neuroblastoma cell may be used and in
such an embodiment the phenotypic change to be
screened may be the length of neurite growth, changes
in filopodia outgrowth, changes in ruffling behaviour
or cell adhesion, any change in microtubule
cytoskeleton, any change in localisation of proteins
on plus end regions of microtubules or any change in a
cell such as apoptosis. In an alternative embodiment
of the method of the invention, the transgenic cell
may comprise an MCF-7 breast carcinoma cell.
Typically in such an embodiment the phenotypic change
to be screened comprises the extent of phagokinesis or
filopodia formation. In an alternative embodiment of
this aspect of the invention, the transgenic cell may
comprise an NIH3T3 cell. Typically in such an
embodiment the phenotypic change to be screened
comprises loss of contact inhibition of foci
formation. The method according to the invention may
also utilise a mutant cell or mutant organism
according to the invention as described above, where
the mutant cell is capable of growing in tissue
culture or in vivo and either of which cell or
organism has a mutation in the wild-type unc-53 gene.
In accordance with the present invention, a
"phenotypic change" may comprise any phenotype
resulting from changes at any suitable point in the
life cycle of the cell, tissue or organism defined
above, which change can be attributed to the
expression of the transgene of the invention such as,
for example, growth, viability, morphology, behaviour,
movement, cell migration or cell process or growth
cone extension of cells and includes changes in body
shape, locomotion, chemotaxis, contact inhibition,
mating behaviour or the like. The phenotypic change
may preferably be monitored directly by visual
inspection of the cell as a whole or by monitoring the
F-actin cytoskeleton, microtubule network and plus end
stability of microtubules or proteins thereon or
alternatively by, for example, measuring indicators of
viability including endogenous or transgenically
introduced histochemical markers or other reporter
genes, such as for example β-galactosidase or green
fluorescent protein.

A compound which is identifiable by the method
according to the invention as described above, as an
enhancer of the processes identified above such as the
regulation of cell shape or motility or the direction
of cell migration may be used as a medicament, or
alternatively in the preparation of a medicament, for
promoting neuronal regeneration, revascularisation, or
wound healing, or for treatment of chronic
neurodegenerative diseases or acute traumatic injuries or
fibrotic disease. Examples of promoting neuronal
regeneration include, for example, peripheral nerve
regeneration after trauma and spinal cord trauma.

Where a compound is identified in accordance with
the method described above as being an inhibitor of
the regulation of cell shape or motility or the
direction of cell migration, the compound may be used
as a medicament, or in the preparation of a
medicament, for substantially alleviating spread of
disease inducing cells, such as in spread of
carcinoma, or the like in metastasis or in alleviating
loss of contact inhibition. Advantageously, any of
the compounds which may have been identified as an
inhibitor or an enhancer in accordance with the method
as described above, may also be included in a
pharmaceutical composition comprising the respective
compound and a pharmaceutically acceptable carrier,
diluent or excipient therefor.

The particular mechanism of action of a compound
identified as either an inhibitor or an enhancer of
the cell motility, shape, growth or direction of cell
migration or microtubule association or to the plus
end region thereof is not limiting. Preferably the
compound acts as an inhibitor or enhancer of a signal
transduction pathway. The compound may also act on a
parallel pathway or directly on the vertebrate
homologue of UNC-53 protein of C. elegans. For
example, the method of action of the compound may
include direct interaction with the vertebrate
homologue of UNC-53 protein, interaction with
processes for regulating phosphorylation or
dephosphorylation of the vertebrate homologue of
UNC-53 or with processes regulating activity of an unc-53
gene or with processes for post-transcriptional or
post-translational modification or the like.

Preferably the compound is identified by the
method according to the invention as an inhibitor or
an enhancer, by utilising differences of phenotype of
the cell, tissue or organism, which are visible to the
eye. Alternatively indicators of viability including
endogenous or transgenically introduced histochemical
markers or a reporter gene may be used.

According to a further aspect of the invention
there is also provided a transgenic cell or tissue
culture which has been constructed to comprise a
promoter sequence of a gene coding for a vertebrate
homologue of UNC-53 of C. elegans according to the
invention operably linked to a nucleic acid sequence
encoding a reporter molecule. Preferably, the
reporter sequence encodes a detectable protein,
for example one which may be monitored by visual
inspection such as antibiotic resistance,
β-galactosidase or a molecule detectable by
spectrophotometric, spectrofluorometric, luminescent
or radioactive assays.

The present invention also provides a method of
determining whether a compound is an inhibitor or an
enhancer of transcription of a gene coding for a
vertebrate homologue of UNC-53 protein of C. elegans
according to the invention, which method comprises the
steps of:

(a) contacting said compound with a transgenic
cell according to the invention as described
above,

(b) monitoring the level of said reporter
molecule and comparing results obtained from this
monitoring step with a control comprising a
transgenic cell having the promoter sequence of a
gene coding for a vertebrate homologue of UNC-53
protein, or a functional fragment of said
homologue and the reporter molecule, in the
absence of the compound.
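
The comparison in step (b) can be illustrated with a
short sketch. It is only an illustration: the function
name, the use of replicate means and the 1.5-fold
threshold are assumptions made here and are not taken
from the specification.

```python
from statistics import mean


def classify_compound(treated, control, fold_threshold=1.5):
    """Compare reporter levels with and without the compound.

    treated / control are replicate reporter readings (any consistent
    unit). fold_threshold is a hypothetical cut-off, not a value from
    the specification.
    """
    if not treated or not control:
        raise ValueError("both conditions need at least one reading")
    control_mean = mean(control)
    if control_mean <= 0:
        raise ValueError("control reporter level must be positive")
    fold = mean(treated) / control_mean
    if fold >= fold_threshold:
        return "enhancer", fold
    if fold <= 1 / fold_threshold:
        return "inhibitor", fold
    return "no effect", fold


if __name__ == "__main__":
    # Made-up readings, for illustration only.
    print(classify_compound([300, 320, 310], [100, 105, 95]))
    print(classify_compound([40, 50], [100, 100]))
```

In practice the cut-off and any statistical test would
be chosen for the particular reporter and assay used.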

In one embodiment of the method according to this
aspect of the invention the reporter molecule may
comprise messenger RNA.

A compound identified as an enhancer of
transcription of the gene coding for the vertebrate
homologue of UNC-53 protein of C. elegans or a
functional equivalent, derivative or bioprecursor of
said homologue may also be used as a medicament, or in
the preparation of a medicament, for promoting
neuronal regeneration, revascularisation or wound
healing, or for treatment of chronic
neurodegenerative diseases or acute traumatic injuries or
fibrotic disease. Furthermore, such compounds may be
included in a pharmaceutical composition including a
pharmaceutically acceptable carrier, diluent or
excipient therefor. Any compounds identified as
inhibitors of transcription may, advantageously, be
used in alleviating the spread of disease inducing
cells such as carcinomas or metastasis or loss of
contact inhibition.

The present invention also provides a kit for
determining whether a compound is an enhancer or an
inhibitor of the regulation of cell growth,
transformation, cell motility or shape or the
direction of cell migration, which kit comprises at
least one transgenic or mutant cell or transgenic or
mutant non-human organism according to the invention
as described above and a plurality of wild-type cells
or a wild-type organism of the same type, or a cell
line or tissue culture and means for contacting said
compound with said cell or organism.

Also provided by the present invention is a kit
for determining whether a compound is an inhibitor or
an enhancer of transcription of a gene coding for a
vertebrate homologue of UNC-53 protein of C. elegans
according to the invention, which kit comprises at
least one transgenic cell or cells according to the
invention, means for contacting said compounds with
said cells and means for monitoring the level of
transcription of said transgenic cell or cells
according to the invention.

For the purposes of the present invention, the
term "gene coding for a vertebrate homologue of UNC-53
or a functional fragment of said homologue" includes
the nucleic acid sequence shown in figure 1 or a
fragment thereof, including the differentially spliced
isoforms and transcriptional starts of the nucleic
acid sequence and which sequence encodes a vertebrate
homologue of UNC-53 protein or a functional
equivalent, derivative, fragment or bioprecursor of
the protein.

The present invention also provides methods of
identifying genes of vertebrates or fragments of said
genes, which encode proteins which are active in the
signal transduction pathway of which the vertebrate
homologue of UNC-53 according to the present invention
is a component. A preferred method comprises
hybridizing to an appropriate cDNA library a
nucleotide sequence, as defined herein, or a fragment
thereof under appropriate conditions of stringency in
order to identify genes having statistically
significant homology with the cDNA clones of any one
of the cDNA sequences according to the invention
described above.

Furthermore, there is also provided by the
present invention a method of identifying a protein
which is active in the signal transduction pathway of
a cell of which a vertebrate homologue of UNC-53
protein of C. elegans according to the invention is a
component. According to this aspect of the invention,
the method comprises:

(a) contacting an extract of said cell with an
antibody to the vertebrate homologue of UNC-53
protein or a functional equivalent, fragment or
bioprecursor of said protein,

(b) identifying the antibody/vertebrate
homologue of UNC-53 complex, and

(c) analysing the complex to identify any
protein bound to the vertebrate homologue of
UNC-53 protein other than the antibody.

The vertebrate homologue of UNC-53 protein,
therefore, may bind regions of other proteins involved
in the signal transduction pathway. It is also
possible to sequentially identify a whole range of
proteins involved in the signal transduction pathway.
Antibodies to the vertebrate homologue of UNC-53
protein may be produced according to known techniques
as would be known to those skilled in the art. For
example, polyclonal antibodies may be prepared by
inoculating a host animal, such as a mouse, with a
protein or epitope of a protein according to the
invention and recovering immune serum.

This aspect of the invention further comprises a
method of identifying a further protein or proteins
which are active in the signal transduction pathway of
a cell of which the vertebrate homologue of UNC-53 is
a component, which method comprises:

(a) forming an antibody to the first identified
protein bound to the vertebrate homologue of
UNC-53 protein in the method as described above,

(b) contacting a cell extract with the antibody,

(c) identifying any antibody/protein complex,

(d) analysing the complex to identify any
further protein bound to the first protein other
than the antibody, and

(e) optionally repeating steps (a) to (d) to
identify further proteins in the pathway.
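
The repetition in step (e) amounts to walking outwards
through a network of binding partners, one round of
pull-down at a time. A minimal sketch of that
bookkeeping follows. The pull_down callable stands in
for the wet-lab steps (a) to (d), and the protein
labels P1 to P3 are hypothetical placeholders, not
proteins named in the specification.

```python
from collections import deque


def trace_pathway(bait, pull_down, max_rounds=10):
    """List proteins reached by repeated pull-down, in discovery order.

    pull_down(protein) is assumed to return the names of proteins found
    bound to `protein` in one round of steps (a) to (d).
    """
    found = []
    seen = {bait}
    queue = deque([(bait, 0)])
    while queue:
        protein, round_no = queue.popleft()
        if round_no >= max_rounds:
            continue  # stop extending the search beyond max_rounds
        for partner in pull_down(protein):
            if partner not in seen:
                seen.add(partner)
                found.append(partner)
                queue.append((partner, round_no + 1))
    return found


# Hypothetical interaction data, for illustration only.
TOY_PARTNERS = {
    "UNC-53 homologue": ["P1", "P2"],
    "P1": ["UNC-53 homologue", "P3"],
    "P2": [],
    "P3": ["P1"],
}

if __name__ == "__main__":
    print(trace_pathway("UNC-53 homologue", lambda p: TOY_PARTNERS.get(p, [])))
```

Tracking the proteins already seen avoids raising
antibodies against the same protein twice when two
partners bind each other.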

According to this aspect of the present
invention, the antibody starts the process by binding
to the vertebrate homologue of UNC-53 protein
according to the invention in the signal transduction
or oncogenic pathways. Any other proteins found
complexed to the bound antibody or UNC-53 protein can
then be used to identify further interacting proteins
involved in the pathway.

It may also be possible to identify proteins
involved in the signal transduction pathway of a cell
of which the vertebrate homologue of UNC-53 is a
component by using a vertebrate homologue of UNC-53
protein of C. elegans. According to this aspect of
the invention the method comprises:

(a) contacting an extract of the cell with the
vertebrate homologue of UNC-53 protein of
C. elegans or a functional equivalent,
fragment or bioprecursor of said homologue,

(b) identifying the vertebrate homologue of
UNC-53 protein/protein complex formed, and

(c) analysing the complex to identify any
protein bound to the vertebrate homologue of
UNC-53 protein other than the same
vertebrate homologue of UNC-53 protein.

This method can also advantageously be used to
identify further proteins in a signal transduction
pathway of a cell by contacting an extract of the cell
used as described above, with any protein identified
from step (c) above not being a vertebrate homologue
of UNC-53 protein and repeating steps (b) and (c).

Other methods which may be used for identifying
proteins in a signal transduction pathway of a cell
may comprise, for example, a western blot overlay method,
which method is well known to those skilled in the
art. Cell extracts are run on gels to separate out
protein and subsequently blotted onto a nylon
membrane. These membranes may then be incubated, for
example in a medium containing vertebrate homologue of
UNC-53 having a label attached thereto such as a
biotin or radiolabel and any protein conjugates
visualised with, for example, a streptavidin or alkaline
phosphatase conjugated antibody.

The present invention also advantageously
provides a process for the preparation of binding
antibodies which recognise proteins or fragments
thereof involved in the rate and direction of cell
migration or the control of cell growth or shape, for
the above methods.

The monoclonal antibody for binding to the
appropriate vertebrate homologue of UNC-53 (or its
functional equivalent) may be prepared by known
techniques as described by Kohler G. and Milstein C.
(1975) Nature 256, 495 to 497.

Another method which may be used to identify
proteins involved in the signal transduction pathway
of a cell of which a vertebrate homologue of an UNC-53
protein of C. elegans according to the invention is
a component, involves investigating protein-protein
interactions using the two-hybrid vector method. This
method, which is well known to those skilled in the
art, was first developed in yeast by Chien et al
(1991). This technique is based on functional
reconstruction in vivo of a transcription factor which
activates a reporter gene. More particularly the
technique comprises providing an appropriate host cell
with a DNA construct comprising a reporter gene under
the control of a promoter regulated by a transcription
factor having a DNA binding domain and an activating
domain; expressing in the host cell a first hybrid DNA
sequence encoding a first fusion of a fragment or all
of a nucleic acid sequence according to the invention
and either said DNA binding domain or said activating
domain of the transcription factor; expressing in the
host at least one second hybrid DNA sequence, such as
a library or the like, encoding putative binding
proteins to be investigated together with the DNA
binding or activating domain of the transcription
factor which is not incorporated in the first fusion;
detecting any binding of the proteins to be
investigated with a protein according to the invention
by detecting the presence of any reporter gene
product in the host cell; and optionally isolating second
hybrid DNA sequences encoding the binding protein.

An example of such a technique utilises the GAL4
protein in yeast. GAL4 is a transcriptional activator
of galactose metabolism in yeast and has a separate
domain for binding to activators upstream of the
galactose metabolising genes as well as a protein
binding domain. Nucleotide vectors may be
constructed, one of which comprises the nucleotide
residues encoding the DNA binding domain of GAL4.
These binding domain residues may be fused to a known
protein encoding sequence, such as for example a
sequence coding for the vertebrate homologue of
UNC-53. The other vector comprises the residues
encoding the protein binding domain of GAL4. These
residues are fused to residues encoding a test
protein, preferably from the signal transduction
pathway of the vertebrate in question. Any interaction
between the vertebrate homologue of UNC-53 protein and
the protein to be tested leads to transcriptional
activation of a reporter molecule in a GAL4
transcription deficient yeast cell into which the
vectors have been transformed. Preferably, a reporter
molecule such as β-galactosidase is activated upon
restoration of transcription of the yeast galactose
metabolism genes. This method enables any
interactions between proteins involved in the signal
transduction pathway or a parallel or redundant
pathway to be investigated.
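
The logic of the two-hybrid read-out can be summarised
in a few lines of code. This is a simplified sketch
only: the labels "DBD" (DNA binding domain) and "AD"
(the second, activating domain), the class and function
names and the test proteins X and Y are all
assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fusion:
    protein: str  # protein fused to the transcription factor domain
    domain: str   # "DBD" or "AD" (labels assumed for this sketch)


def reporter_on(first, second, interactions):
    """Return True if the reporter gene would be activated.

    The reporter is switched on only when one fusion carries the DNA
    binding domain, the other carries the activating domain, and the two
    fused proteins bind each other. `interactions` is a set of
    frozensets of protein names known (or assumed) to interact.
    """
    domains_complete = {first.domain, second.domain} == {"DBD", "AD"}
    pair = frozenset((first.protein, second.protein))
    return domains_complete and pair in interactions


if __name__ == "__main__":
    bait = Fusion("UNC-53 homologue", "DBD")
    library = [Fusion("X", "AD"), Fusion("Y", "AD")]  # hypothetical preys
    known = {frozenset(("UNC-53 homologue", "X"))}
    hits = [prey.protein for prey in library if reporter_on(bait, prey, known)]
    print(hits)
```

Screening a library then reduces to collecting every
second fusion for which the reporter is detected.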

Any proteins identified in the signal
transduction pathway of the cell, which may be for
example a mammalian cell, may also be included in a
pharmaceutical composition together with a
pharmaceutically acceptable carrier, diluent or
excipient therefor.

The present invention also provides a process for
producing a vertebrate homologue of an UNC-53 protein
of C. elegans according to the invention, which process
comprises culturing the cells transformed or
transfected with a cDNA expression vector having any
of the cDNA sequences according to the invention as
described above, and recovering the expressed protein
homologue. The cell may advantageously be a
bacterial, animal, insect or plant cell.

A particularly preferred process for producing
said vertebrate homologue of UNC-53 protein uses
insect cells. Accordingly, the invention provides a
process for producing a vertebrate homologue of UNC-53
protein of C. elegans according to the invention, which
process comprises culturing an insect cell transformed
or transfected with a recombinant Baculovirus vector,
said vector comprising a nucleotide sequence encoding
said vertebrate homologue of UNC-53 protein according
to the invention downstream of the Baculovirus
polyhedrin promoter, and recovering the expressed
protein. Advantageously, this method produces large
amounts of protein for recovery. The insect cell may
be from, for example, Spodoptera frugiperda or
Drosophila melanogaster.

In accordance with the present invention, a
defined nucleic acid sequence includes not only the
identical nucleic acid but also any minor base
variations from the natural nucleic acid sequence
including, in particular, substitutions in bases which
result in a synonymous codon (a different codon
specifying the same amino acid), due to the degenerate
code, or in a conservative amino acid substitution. The
term "nucleic acid sequence" also includes the
complementary sequence to any single stranded sequence
given which includes the definition above regarding
base variations.
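
The notion of a synonymous codon used in the definition above can be
illustrated with a minimal sketch. It assumes the standard genetic code
(which the specification does not name explicitly); the function names
translate_codon and is_synonymous are hypothetical and chosen only for
this illustration.

```python
# Standard genetic code, built from the usual compact 64-letter string in
# which the first, second and third codon positions each cycle through
# T, C, A, G. Assumption: the standard code applies to the sequences at hand.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_codon(codon):
    """Return the one-letter amino acid ('*' for stop) for a DNA or RNA codon."""
    return CODON_TABLE[codon.upper().replace("U", "T")]


def is_synonymous(codon_a, codon_b):
    """True when two codons specify the same amino acid."""
    return translate_codon(codon_a) == translate_codon(codon_b)


if __name__ == "__main__":
    # GAA and GAG both encode glutamate; GAT encodes aspartate.
    print(is_synonymous("GAA", "GAG"), is_synonymous("GAA", "GAT"))
```

A base substitution for which is_synonymous returns True leaves the
encoded protein unchanged, which is the kind of minor base variation the
definition is intended to cover.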

Furthermore, a defined protein, polypeptide or
amino acid sequence according to the invention
includes not only the identical amino acid sequence
but also minor amino acid variations from the natural
amino acid sequence including conservative amino acid
replacements (a replacement by an amino acid that is
related in its side chains). Also included are amino
acid sequences which vary from the natural amino acid
sequence but result in a polypeptide which is
immunologically identical or similar to the polypeptide
encoded by the naturally occurring sequence. Such
polypeptides may be encoded by a corresponding nucleic
acid sequence.

A further aspect of the invention provides a
nucleic acid sequence of at least 15 nucleotides of a
nucleic acid according to the invention and preferably
from 15 to 50 nucleotides.

These sequences may advantageously be used as
probes or primers to initiate replication or the like.
Such nucleic acid sequences may be produced according
to techniques well known in the art, such as by
recombinant or synthetic means. They may also be used
in diagnostic kits or the like for detecting the
presence of a nucleic acid according to the invention.
These tests generally comprise contacting the probe
with a sample under hybridising conditions and
detecting the presence of any duplex formation
between the probe and any nucleic acid in the sample.
Nucleic acid sequences according to the invention may
also be produced using recombinant or synthetic means
such as described in Sambrook et al (Molecular
Cloning: A Laboratory Manual, 1989). Advantageously,
human allelic variants or polymorphisms of the DNA
according to the invention may be identified by, for
example, probing DNA from a range of individuals, for
example from different populations. Furthermore,
nucleic acids and probes according to the invention
may be used to sequence genomic DNA from patients
using techniques well known in the art, such as the
Sanger dideoxy chain termination method, which may
advantageously ascertain any predisposition of a
patient to certain disorders.
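
The probe-based detection described above can be sketched in simplified
form as a search for perfect duplex sites. The sketch below is an
illustration only: it assumes exact Watson-Crick complementarity (real
hybridisation tolerates mismatches depending on stringency), and the
function names and the example sequence are hypothetical.

```python
# Translation table for Watson-Crick complements of DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_binding_sites(sample_strand, probe, min_len=15, max_len=50):
    """Return 0-based start positions in sample_strand where the probe
    would form a perfect duplex (i.e. where its reverse complement occurs).

    The 15 to 50 nucleotide bounds follow the preferred probe length
    stated in the text; exact matching is a simplifying assumption.
    """
    probe = probe.upper()
    if not min_len <= len(probe) <= max_len:
        raise ValueError("probe length outside the preferred range")
    target = reverse_complement(probe)
    sample = sample_strand.upper()
    sites = []
    start = sample.find(target)
    while start != -1:
        sites.append(start)
        start = sample.find(target, start + 1)
    return sites


if __name__ == "__main__":
    site = "ATGGCCAAGTTCGAC"           # hypothetical 15-nucleotide target
    sample = "GGG" + site + "TTT"      # hypothetical sample strand
    probe = reverse_complement(site)   # probe complementary to the target
    print(probe_binding_sites(sample, probe))
```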

A method of detecting whether a compound is an
inhibitor or an enhancer of expression of a vertebrate
homologue of UNC-53 of C. elegans according to the
invention is also provided, which method comprises
contacting a cell expressing said homologue with said
compound and monitoring for a phenotypic change
compared to a control cell which has not been
contacted with said compound.

Preferably the cell is a transgenic cell as
described above. Alternatively the cell may have
undergone loss of contact inhibition.

The present method also provides for determining
whether said compound is an inhibitor of expression of
said vertebrate homologue. In one embodiment the
compound to be tested comprises a nucleic acid.

Preferably said nucleic acid sequence comprises
an antisense DNA sequence or a mRNA sequence.

Preferably said mRNA sequence comprises 3'
untranslated regions of mRNA encoding said
vertebrate homologue.

Alternatively, the compound to be tested may be a
protein. Preferably, said protein comprises a protein
having an amino acid sequence potentially suitable for
inhibiting function of said vertebrate homologue and
preferably comprises a protein identified by the
methods as described herein.

The present invention also provides a
pharmaceutical composition comprising a compound, for
example an antisense nucleic acid, identified according
to the above described method together with a
pharmaceutically acceptable carrier, diluent or
excipient therefor.

A nucleic acid sequence or protein identified
according to this aspect of the invention may be used
as a medicament, or in the preparation of a
medicament, for treating loss of contact inhibition of
cancer which is mediated by a vertebrate homologue of
UNC-53 protein or a functional equivalent, fragment,
derivative or bioprecursor of said homologue.

Further provided by the invention is a nucleic
acid as defined above for use in preparation of a
medicament for inhibiting expression of a gene coding
for a vertebrate homologue of UNC-53 protein of
C. elegans.

Further provided by the invention is an assay for
detecting expression of the vertebrate homologue of
UNC-53 protein of C. elegans in a vertebrate cell,
which assay comprises contacting a cell or an extract
thereof with an antibody to said vertebrate homologue,
which antibody is fused to a reporter molecule,
removing any unbound antibody and monitoring for the
presence of said reporter molecule.

Preferably the reporter molecule is an antibody
conjugated to, for example, a fluorophore such as
fluorescein or alternatively to an enzyme such as
streptavidin.

There is also provided a method for detecting
expression of a gene coding for the vertebrate
homologue of UNC-53 protein of the invention, which
method comprises contacting a probe specific for a
nucleic acid or protein sequence coding for or
corresponding to said vertebrate homologue according
to the invention with a cell extract, which probe is
linked to a reporter, and analysing for the presence of
said reporter.

Preferably the probe is a complementary sequence
to a region of mRNA transcribed from said gene
encoding said vertebrate homologue of UNC-53 protein
according to the invention.

Preferably the complementary sequence is complementary
to a 3' or 5' untranslated region of said mRNA.
Preferably said reporter may be a dig label, a
fluorophore, a hapten or a radiolabel.

Alternatively said probe may comprise an antibody 
specific for said vertebrate homologue of said UNC-53 
protein. 

Preferably the reporter is an antibody conjugated
to, for example, a fluorophore such as fluorescein or
alternatively an enzyme such as streptavidin.

As described above, UNC-53 protein of C. elegans
has been found to localise to microtubules and
particularly to microtubule (+) ends. Therefore,
there is provided by a further aspect of the present
invention a method of determining whether a compound
is an inhibitor or an enhancer of association of the
UNC-53 homologue of the invention to microtubules or
plus end regions thereof, which method comprises (a)
contacting said compound with a transgenic cell,
tissue or organism expressing said vertebrate
homologue and which protein is operably linked to a
reporter molecule, and (b) screening for the
localisation of said reporter molecule as compared to a
cell according to step (a) which has not been contacted
with said compound.

A compound identifiable by the above method also
forms part of the present invention. Such a compound
identified as an inhibitor of localisation or
association of said vertebrate homologue with
microtubules or the plus end region thereof may be
used in alleviating the spread of disease inducing
cells or metastasis or loss of contact inhibition.
Further, a compound identified as an enhancer of
association of said vertebrate homologue with
microtubules or the plus end region thereof may be
used in, for example, promoting neuronal regeneration,
revascularisation or wound healing, or for treating
chronic neurodegenerative diseases or acute traumatic
injuries or fibrotic disease. These compounds may
then be included in a pharmaceutical composition,
together with a pharmaceutically acceptable carrier,
diluent or excipient therefor.

Also provided by the present invention is a kit
for determining whether a compound is an inhibitor or
an enhancer of association of the vertebrate homologue
according to the invention with microtubules
or the plus end regions thereof, which kit comprises
at least one transgenic cell expressing said UNC-53
vertebrate protein homologue and a reporter molecule,
or a host or transgenic cell according to the
invention, and at least one cell of the same cell type
for use as a control, and means for contacting said
compound with one of said at least one transgenic
cells. Compounds identified as inhibitors or
enhancers of microtubule association described above
may advantageously be included in a composition and
linked to said vertebrate homologue according to the
invention to target the compounds to the microtubules
or the plus end regions thereof. Such a composition
may also comprise, for example, a suitable
transfecting or transformation agent.

According to a further aspect of the invention
there is provided a method of targeting a protein to a
cell microtubule or the plus end region thereof, which
method comprises introducing into a host cell, tissue
or organism a transgene comprising a sequence capable
of expressing said UNC-53 vertebrate homologue
according to the invention, which sequence is operably
linked to a sequence encoding said protein to be
targeted such that a chimeric protein is expressed and
which results in targeting of said protein to said
microtubule or a plus end region thereof. An even
further aspect of the invention comprises a method of
identifying a molecule which covalently modifies
said vertebrate homologue according to the invention,
which method comprises a) contacting either an extract
from a cell or cells expressing said vertebrate
homologue or a mixture of enzymes comprising candidate
UNC-53 modifying enzymes in the presence of an
indicator of covalent modification of a protein, b)
identifying any covalently modified UNC-53 protein
from step a) and c) identifying said molecule involved
in said modification step. Such an indicator may be
32P.

Further provided by the invention is a method of
identifying a compound which alleviates or enhances
the toxicity of said UNC-53 vertebrate homologue
according to the invention, or which
alleviates or enhances apoptosis. The method of the
former comprises contacting said compound with a
transgenic cell, tissue or organism according to the
invention and monitoring for the presence of said
reporter molecule adjacent said microtubules or the
plus end region thereof. In the case of apoptosis the
method comprises monitoring the effect of the compound
on cell death.

The invention may be more clearly understood from
the following examples, which are purely exemplary,
with reference to the accompanying drawings wherein:

Figure 1(a) is an illustration of the nucleotide
sequence encoding the first human homologue of UNC-53
designated Hs-UNC-53/1 and further variants thereof.

Figure 1(b) is an illustration of the amino acid
sequence of Hs-UNC-53/1 encoded by the sequences in
Figure 1(a).

Figure 1(c) is an illustration of the nucleotide
sequence encoding the second human homologue of UNC-53
protein of C. elegans designated Hs-UNC-53/2 and
further variants thereof.

Figure 1(d) is an illustration of the amino acid
sequences of Hs-UNC-53/2 encoded by the sequences in
Figure 1(c).

Figure 1(e) is an illustration of a nucleotide
sequence encoding the third human homologue of UNC-53
protein according to the invention designated
Hs-UNC-53/3, and variants thereof.

Figure 1(f) is an illustration of the amino acid
sequences of the Hs-UNC-53/3 encoded by the sequences
of Figure 1(e).

Figure 1(g) is an illustration of the nucleotide
sequence of a genomic DNA fragment that contains a
putative 5' exon of Hs-unc-53/1.

Figure 1(h) is an illustration of the nucleotide
sequence AB023155 encoding the protein KIAA0938, a
transcript comprising the 3' half of Hs-unc-53/3.

Figure 1(i) is an overview of the C. elegans and
human UNC-53 proteins as cloned. The 5' truncated
variants and a number of the known splice variants
have been indicated.

Figure 2 is an alignment of the amino acid
sequences of Ce-UNC-53, Hs-UNC-53/1, Hs-UNC-53/2 and
Hs-UNC-53/3.

Figure 3 is an alignment of the C. elegans unc-53
and the predicted amino acid sequence of C. briggsae
unc-53.

Figure 4 is a list of ProSite signatures for
vertebrate UNC-53s based on the sequence alignment.

Figure 5(a) is an illustration of expression of the
three human UNC-53s as studied by Northern blotting.

Figure 5(b) is an illustration of differential
expression of Hs-unc-53/3 in different brain parts.

Figure 6(a) is an illustration of differential
splice variant expression of Hs-unc-53/1 using RT-PCR.

Figure 6(b) is an illustration of differential
splice expression of Hs-unc-53/2 using RT-PCR.

Figure 6(c) is an illustration of differential
expression of Hs-unc-53/3 using RT-PCR.

Figure 6(d) is a sequence confirmation of
AB023155 expression in cells other than brain using
RT-PCR.

Figure 7(a) is an illustration of the cloning of
Hs-unc-53/3.

Figure 7(b) is a plasmid map and the nucleotide
sequence of the pGI3303 expression vector (C-terminal
Hs-unc-53/3 fragment in fusion with GFP).

Figure 7(c) is an illustration of the amino acid
sequence of GFP:C-terminal Hs-unc-53/3 fragment
(insert of pGI3303).

Figure 7(d) is a plasmid map and the nucleotide
sequence of the pGI3305 expression vector (full length
Hs-unc-53/3 in fusion with GFP).

Figure 7(e) is an illustration of the amino acid
sequence of GFP:Hs-unc-53/3 (insert of pGI3305).

Figure 8 is an illustration of the filopodia and
lamellipodia outgrowth of N4 mouse neuroblastoma cells
transfected with pGI3303 (F-actin cytoskeleton
reorganisation).

Figure 9 is an illustration of the co-localisation
of the GFP:Hs-unc-53/3 fusion protein
with microtubules in N4 mouse neuroblastoma cells
transfected with pGI3305.

Figure 11a is an illustration of the homology
domains between Hs-unc-53/3 and a gene encoded
(partially) by the Drosophila melanogaster BAC clone
BACR48M05 (AC005719). Results of a TBLASTN search on
the non-redundant database with Hs-unc-53/3 as query.

Figure 11b is an illustration of an ORF encoded
by the Drosophila melanogaster BAC clone BACR48M05
(AC005719) as predicted by the computer program Fgene.

Figure 11c is an illustration of a "BLAST 2
sequences" search result with Hs-unc-53/3 as query and
the Fgene predicted UNC53 homology ORF of D.
melanogaster BAC clone BACR48M05.

Figure 12 is an illustration of a zebrafish EST
encoding Dr-unc-53/2.

Figure 13 shows the Genemap98 results for Hs-unc-53/2.

Figure 14 is a schematic drawing of the
sequence of the exon containing the putative
alternative start codon of human Hs-unc-53/1.

Figure 15 is an illustration of the nucleotide
sequence of pGI3150 and the amino acid sequence of the
eGFP fusion with a C-terminal fragment of Hs-unc-53/1.

Figure 16 is an alignment of EST clone yk480b6
and Ce-unc-53 demonstrating a novel splice variant of
Ce-unc-53.

Figure 17 is a graphical display of the effect of
Hs-unc-53/3 GFP chimera transient transfection on the
form factor of N4 cells.






DEPOSITED MATERIAL 



Plasmids pGI3303 and pGI3305 were deposited under
accession numbers LMBP3936 and LMBP3937 respectively
on 28 May 1999 at the Belgian Coordinated Collections
of Microorganisms (BCCM) at Laboratorium voor
Moleculaire Biologie - Plasmidencollectie (LMBP),
B-9000 Ghent, Belgium, in accordance with the provisions
of the Budapest Treaty of April 28 1977.



Hs-UNC-53/3 is a bona fide UNC-53 (fig. 1; 2; 3) 



Blastn and Tblastn EST-database mining using the
sequence of the already known animal UNC-53s led to
the identification of 3 ESTs suggestive of novel
unc-53s (see experimental procedures). By 3'- and
5'-RACE extension using suitable libraries, it was shown
that these ESTs identified a novel unc-53 designated
Hs-unc-53/3 (Fig. 1e; f). The publication of the
sequence AB023155 (Nagase et al. 1999, DNA Res.
6:63-70) independently confirmed the correctness of the
3'-end of Hs-unc-53/3 as well as the existence of one
new intron that forms the 5'-end of AB023155. Alignment
of the C. elegans and 3 human UNC-53 sequences (fig.
2) clearly illustrates that the third human homologue
of C. elegans UNC-53 protein is a bona fide UNC-53
with highest similarity to Hs-UNC-53/2 and, in
decreasing order, to Hs-UNC-53/1 and C. elegans
UNC-53 (Ce-UNC-53).

Many of the domains of Hs-UNC-53/3 show highest
similarity to functional domains of other animal
UNC-53s (fig. 2). This strongly suggests that
Hs-UNC-53/3 most likely has the key functionalities
observed for Ce-UNC-53 in a variety of assays including
F-actin binding, F-actin reorganisation in cell culture,
microtubule and microtubule (+)-end binding in
cultured cells, and binding of SH3-domain adapters like
SEM-5/GRB-2 or other types of binders of proline rich
alpha-helices. These results indicate that, like
Ce-UNC-53, Hs-UNC-53/1, Hs-UNC-53/2, or Hs-UNC-53/3 can
be used in a range of biochemical, cellular and animal
assays aimed at discovering tissue- or disease-specific
modulators of Hs-unc-53 functioning in
diagnostic assays.

Further extension of the UNC-53 family (Fig. 11, 12)

Database searches with the three human UNC-53
protein sequences revealed several expressed sequence
tags (ESTs) and genomic DNA sequences (BACs) that show
significant similarity to human UNC-53.

C. briggsae

The C. elegans genome consortium sequenced the
locus of the C. briggsae unc-53 homologous gene.
Using gene prediction programs and
the cDNA sequence of the C. elegans unc-53, a prediction
can be made for the C. briggsae protein sequence.
Alignment of the derived C. briggsae
amino acid sequence with the C. elegans amino acid
sequence in figure 3 demonstrates the strong homology
of both proteins.

D. melanogaster 

BAC clone BACR48M05 (AC005719) clearly contains 3
different exons with high homology to Hs-unc-53/3
(Figure 11). Using the gene structure prediction
program Fgene (Solovyev et al., 1995, in: Proceedings
of the Third International Conference on Intelligent
Systems for Molecular Biology (eds. Rawling et al.,
Cambridge, England, AAAI Press); Solovyev and
Lawrence, 1993, in: Abstracts of the 4th annual Keck
symposium, Pittsburgh, 47) it was possible to predict
an ORF encoded by BAC clone BACR48M05 that shows
homology to Hs-unc-53/3 (Figure 11b). However, every
Drosophila cDNA partially or entirely encoded by BAC
clone BACR48M05 and which contains one or more
sequence blocks as indicated in figure 11a should be
considered as a family member of the UNC-53 family. A
"BLAST 2 sequences" search indicates that the sequence
situated between the three homology blocks that are
indicated in figure 11a is less conserved between
human and Drosophila (Figure 11c). The predicted ORF
of the Drosophila melanogaster UNC53 gene can be used
to identify new members of the family. The zebrafish
EST fc21d06 (AI658309) shows an identity of 84% and a
homology of 92% to Hs-UNC-53/2. It clearly can be
considered as a part of the zebrafish homologue of
Hs-UNC-53/2 (Figure 12). Finally, a whole series of
human ESTs have been placed in public domain
databases. To our knowledge, no one has been able to
place these ESTs into contigs that describe a true
Hs-unc-53 to a level presented in this specification.
The presently available unc-53 sequences - expressed
or genomic - further underscore that the unc-53 gene
family is a true animal gene family in helminths,
vertebrates and arthropods, three major classes of the
animal kingdom.
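
The identity and homology percentages quoted above for the zebrafish
EST can be illustrated with a minimal sketch of how such figures are
commonly derived from a pairwise alignment. It assumes that "homology"
is used here in the sense of similarity (identical plus conservatively
replaced residues); the grouping of related amino acids below is a
hypothetical simplification rather than the scoring matrix used by
BLAST, and the example sequences are invented.

```python
# Hypothetical groups of amino acids with related side chains. A real
# search tool scores similarity with a substitution matrix instead.
CONSERVATIVE_GROUPS = ["AVLIM", "FYW", "KRH", "DE", "ST", "NQ"]


def percent_identity_and_similarity(aligned_a, aligned_b):
    """Return (identity %, similarity %) over gap-free alignment columns.

    Both inputs must be rows of the same alignment, with '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # columns containing a gap are not counted
        compared += 1
        if a == b:
            identical += 1
            similar += 1
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100 * identical / compared, 100 * similar / compared


if __name__ == "__main__":
    # Invented toy alignment, not taken from the figures.
    print(percent_identity_and_similarity("MKLDE-ST", "MRLEEAGT"))
```

Similarity is by construction never lower than identity, which is
consistent with the higher second figure reported for the EST.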

Refined UNC-53 family description based on
alignment (fig. 4).

The alignment of the three human and the
C. elegans UNC-53 sequences enables the more refined
definition of conserved regions in UNC-53s. In figure
4 there are compiled a number of ProSite signatures
for either the four animal or the three human UNC-53s.

Differential expression of Hs-UNC-53/3 by
Northern blot (fig. 5).

To determine in which cells and tissues the
vertebrate UNC-53s play a role, a Northern blot
analysis has been performed. As indicated in the
experimental section, relevant probes were amplified
and used to visualise in which normal human tissues
and in which cancer cell lines the three human UNC-53s
were expressed.

1. Cancer cell line RNA blots probed with
Hs-Unc53/1.

A Northern blot of poly-A+ RNA from several
cancer cell lines (Melanoma G361, Lung Cancer A549,
Colorectal Adenocarcinoma SW480, Burkitt Lymphoma
Raji, Leukemia Molt4, Lymphoblastic Leukemia K562,
HeLa S3 and Promyelocytic Leukemia HL60) was probed
using the whole insert of pHH3b. No or weak
expression was detected in the Burkitt Lymphoma
Raji, the Leukemia Molt4 and the Promyelocytic
Leukemia HL60 cell lines. Five different transcripts
are detected in the remaining cancer cell lines:
transcripts 1 and 2 are larger than 9.5 kb, transcripts
3 and 4 are 6 to 7 kb and the fifth transcript is
around 6 kb. Transcripts 1 and 2 are present in all
expressing cell lines but at different levels.
Transcripts 3 and 4 are restricted to Melanoma G361,
Lung Cancer A549 (weak) and Colorectal Adenocarcinoma
SW480 and are the predominant transcripts in Melanoma
G361 and Colorectal Adenocarcinoma SW480. Transcript
5 is restricted to Lymphoblastic Leukemia K562 (weak)
and HeLa S3, where it is the predominant transcript.

2. Cancer cell line RNA blots probed with
Hs-Unc53/2.

A similar set of cancer cell line Northern
blots were probed with a 652 bp fragment of EST46037
amplified by using the primers
5'-aggagatgaagctgacagatatcc and 5'-aaacaccagtgagtcc.
Hs-Unc53/2 is expressed in Melanoma G361, Colorectal
Adenocarcinoma SW480, Lymphoblastic Leukemia K562 and
HeLa S3. No expression was detected in Lung Cancer
A549, Burkitt Lymphoma Raji, Leukemia Molt4 and
Promyelocytic Leukemia HL60. Interestingly, only 2
transcript sizes were detected: one of around 7 kb,
expressed in Lymphoblastic Leukemia K562 and HeLa S3,
and one of >9.5 kb, in Melanoma G361 and
Colorectal Adenocarcinoma SW480 and weakly in HeLa S3.
Noteworthy is the very high expression in Melanoma
G361.
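
As an aside on the two primers quoted above, basic properties of an
oligonucleotide such as length, GC content and an approximate melting
temperature can be computed as sketched below. The Wallace rule
(2 °C per A or T, 4 °C per G or C) is used purely as a rough rule of
thumb for short oligonucleotides; it is an assumption of this sketch and
not a condition stated in the specification, and the names PRIMER_1 and
PRIMER_2 are hypothetical labels.

```python
# Primer sequences as quoted in the text (5' to 3').
PRIMER_1 = "aggagatgaagctgacagatatcc"
PRIMER_2 = "aaacaccagtgagtcc"


def base_counts(primer):
    """Count each DNA base in the primer (case-insensitive)."""
    primer = primer.upper()
    return {base: primer.count(base) for base in "ACGT"}


def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    counts = base_counts(primer)
    return (counts["G"] + counts["C"]) / len(primer)


def wallace_tm(primer):
    """Approximate melting temperature in deg C by the Wallace rule."""
    counts = base_counts(primer)
    return 2 * (counts["A"] + counts["T"]) + 4 * (counts["G"] + counts["C"])


if __name__ == "__main__":
    for name, primer in (("primer 1", PRIMER_1), ("primer 2", PRIMER_2)):
        print(name, len(primer), round(gc_fraction(primer), 2), wallace_tm(primer))
```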

3. Normal human tissue probed with Hs-Unc53/1.

A Northern blot of poly-A+ RNA from normal
human tissue was probed using the whole insert of
phage HH3b. Expression levels are low in all tissues
with the highest level in heart and placenta, several
fold lower levels in brain and testis, even lower
levels in skeletal muscle, pancreas, thymus, colon,
small intestine, ovary and prostate. Expression in
peripheral blood leukocyte, lung, liver, kidney and
spleen is barely detectable.

4. Normal human tissue probed with Hs-Unc53/2.

A similar set of blots were probed with a
652 bp fragment of EST46037 amplified by using the
primers 5'-aggagatgaagctgacagatatcc and
5'-aaacaccagtgagtcc. Expression levels are low in all
tissues with the highest level in kidney, placenta and
pancreas, lower levels in heart and lung. Expression
is barely detectable or undetectable in skeletal
muscle, spleen, thymus, prostate, testis, ovary, small
intestine, colon, peripheral blood leucocyte, stomach,
thyroid, spinal cord, trachea, adrenal gland and bone
marrow. Also Hs-unc-53/2 appears to be expressed as
different transcripts (figure 5a).

The Hs-UNC-53/1 and Hs-UNC-53/2 homologues are
clearly highly regulated genes, showing a strong
tissue specificity and, probably, additional
mechanisms of regulation (i.e. differential splicing or
different promoters). The different proteins derived
from RNAs identified by probe hhl5 presumably share
the carboxyterminal nucleotide binding domain.
Ce-UNC-53 was shown to be a complex genetic locus and
complex transcription unit. The different transcripts
are thought to be a mechanism to assure the necessary
specificity and functional diversity of this signal
transduction pathway, with respect to different
signals and receptors, different tissues and different
directions of migration. The occurrence of a new
transcript or the observed changes in expression
levels in the cancer cell line blot suggests a role
for Hs-UNC-53/3 in the establishment or maintenance of
the transformed state of those cells.

Expression pattern of Hs-UNC-53/3.

A Northern blot of poly-A+ RNA from several cancer
lines was probed with unique fragments of the three
genes from the Hs-unc-53 family. Hs-unc-53/3 has a
high expression level in lung carcinoma line A549,
where only a moderate expression of Hs-unc-53/1 has
been detected. Furthermore, moderate expression of
Hs-unc-53/3 was also observed in melanoma line G361,
where previously a high expression of Hs-UNC-53/1 and
Hs-UNC-53/2 has been observed. This indicates the
involvement of Hs-unc-53/3 in at least two cancer
lines.

In normal human tissues, the expression of
Hs-unc-53/3 shows a clearly new and previously unobserved
expression pattern. This difference of expression of
Hs-unc-53/3 in relation to its homologues Hs-unc-53/1
and Hs-unc-53/2 is important for the allocation of
functionality to Hs-unc-53/3.

Hs-unc-53/3 is highly expressed in brain, as
shown on the Northern blots (figure 5a). In figure 5b
it can be seen that Hs-unc-53/3 also is differentially
expressed in different parts of the brain. Its
homologues are not or weakly expressed in brain. This
gives an indication that its function in
directionality of cell migration and growth cone
steering will be in relation to specific regions or
cells of the brain. It is deduced that Hs-unc-53/3
will be an important signal transducer or signal
adapter linking signals to neuronal outgrowth, axon
guidance, and formation and maintenance of synaptic
connections. It seems that the function of
Hs-unc-53/3 will be associated with neuron-neuron
interactions, neuronal outgrowth, neuron-muscle
interactions, and post-synaptic signal transduction.
Furthermore, Hs-unc-53/3 may be involved in the
development of cancer of neuronal origin, like
neuroblastomas, or in the development of tumours that
have their developmental origin in the brain, as do
some eye diseases like retinoblastomas.

The significance of the high expression of
Hs-unc-53/3 in brain tissue can be associated with the
high levels of expression which have also been observed
in the spinal cord, containing neuronal tissue. Here,
neuronal (axon) outgrowth and neuron-neuron
connections are of importance. Development of
pharmacological tools acting on this pathway may lead
to treatments of diseases involved in the growth and
movement of neuronal cells, and the regeneration of
neuronal connectivity after trauma, or the inhibition
of neuronal cancers such as neuroblastomas. Due to
its specific expression, inhibitors and/or enhancers
specific for Hs-unc-53/3 will have an advantage as a
pharmaceutical compound over more general compounds
acting on the Hs-unc-53 family of genes and proteins.
A second tissue where Hs-UNC-53/3 is highly
expressed and where its other human homologues are
not expressed is the spleen. Hs-UNC-53/3 could
therefore function as part of the signal transduction
pathway involved in the maturation of leukocytes.
Malfunction of this pathway may lead to incorrect
maturation of the leukocytes and the development of
autoimmune diseases such as rheumatoid arthritis and
sclerosis. Next to the signalling function in the
recognition of the leukocytes, Hs-UNC-53/3 may also
play an important role in the induction and/or
signalling pathway of the mechanism underlying
apoptosis of leukocytes in the spleen. Pharmaceutical
methods involving the Hs-UNC-53/3 pathway, which may,
for example, result in an inhibition and/or
enhancement of its expression, may lead to treatment of
these disorders. Furthermore, an inhibitor or enhancer
specific for Hs-unc-53/3 may have an advantage, as it
will act in a more specific manner.
The Hs-UNC-53/3 protein is also highly expressed
in the ovary, where the two other human homologues are
also expressed. Finally, moderate to low expression of
Hs-unc-53/3 is observed in heart, placenta, testis,
stomach and adrenal gland.

Although the predominant transcripts of
Hs-unc-53/3 are > 9 kb, often a smear occurs that ends
with somewhat higher intensity at 5.5 - 6.5 kb. This
short transcript may correspond to AB023155.

The Hs-unc-53/3 gene is a highly regulated gene,
showing strong tissue specificity and additional
mechanisms of regulation which have not previously
been identified in any of its known homologues. These
findings may thus lead to the development of more
specific inhibitors or enhancers of Hs-UNC-53/3 and/or
of the Hs-UNC-53/3 pathway. The Northern blot studies
indicate that the three human unc-53s are complex
transcriptional units with highly regulated tissue
specificity and that transcripts of different lengths
exist.

Splice variants of human unc-53s 

Whilst cloning Hs-unc-53/3, it became apparent 
25 that at least three expression variants of Hs-unc-53/3 
- most probably alternative splices - exist (fig. le, 
f; lowercase regions) . Targeted efforts for the two 
other human UNC-53s demonstrated that the other human 
UNC— 53s contained variants (fig. la, c and e regions) . 
30 Splice variants as observed to date appear to be 

concentrated in specific regions. A first one 
(starting at position 1252 in fig. 2) - in which the 
overall amino acid similarity is weak - contains 2 
(splice) variants of both Ce-unc-53 and Hs-unc-53/3. 
35 In the worm, the presence or absence of these 2 exons 
in unc-53 regulates the function of the UNC-53 protein 
in such a way that cells differentially translate 
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extra-cellular signal gradient as an attractive or 
repulsive signal. The most 3' -variant of Hs-unc-53/2 
roughly covers the 2 Ce-unc-53 variants. 

The complexity of variation in this zone of
Hu-UNC-53 might resemble the situation in the nematode.
In Hs-unc-53/3, for example, the region from position
3795 to 4325 (figure 1e) consists of two adjacent
blocks (3795 to 4283 and 4286 to 4325 in figure 1e)
that can independently be present in or absent from
cDNAs from frontal cortex tissue. In contrast, no
variants have as yet been observed in this zone for
Hu-UNC-53/1 or /2.
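The statement that the two adjacent blocks can independently be present or absent implies four possible combinations in this zone. A minimal sketch of that enumeration follows. It assumes that the positions quoted from figure 1e are inclusive nucleotide coordinates; the names `BLOCKS`, `block_length` and `enumerate_variants` are illustrative only and are not part of the described work.

```python
from itertools import product

# Block coordinates as quoted in the text (figure 1e).
# Assumption: both end positions are inclusive.
BLOCKS = {
    "block_1": (3795, 4283),
    "block_2": (4286, 4325),
}


def block_length(start, end):
    """Length of a block, counting both end positions."""
    return end - start + 1


def enumerate_variants(blocks):
    """List every present/absent combination of independently spliced blocks.

    Returns (included_block_names, total_included_length) tuples.
    """
    names = list(blocks)
    variants = []
    for presence in product([True, False], repeat=len(names)):
        included = tuple(n for n, keep in zip(names, presence) if keep)
        length = sum(block_length(*blocks[n]) for n in included)
        variants.append((included, length))
    return variants


if __name__ == "__main__":
    for included, length in enumerate_variants(BLOCKS):
        label = " + ".join(included) if included else "neither block"
        print(f"{label}: {length} positions retained")
```

Which of these combinations actually occur in a given tissue remains an experimental question; the sketch only lists the theoretical possibilities.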

The second variant in Hs-unc-53/3 (fig. 2)
deletes a box (MQLDNRTLPKKGLR), which is extremely
conserved (in bold) among all human unc-53s. The
occurrence of this variant could indicate
differentially active functional variants of
Hu-unc-53/3.

A second region in which splice variants were
observed contains a major highly conserved domain of
unc-53s. Hs-unc-53/1 has a first variant that
comprises the most N-terminal portion of this
conserved domain (SGSFRD). A second splice variant in
Hs-unc-53/1 (AEERMOSE) lies within the highly
conserved domain. Another conserved spot for splice
variation in human unc-53s has been found (figure 2):
Hs-unc-53/1 {VYE}; -/2 {VNE} and -/3 {NSRGSEL}. All
these spliced exons are flanked by two conserved
charged domains - putative nuclear localisation
signals. Given this conservation, we searched for
splice variation in C. elegans and found it to exist
in the form of an extra exon (ALSVDSQ) (figure 2).
Hu-unc-53/3 has another variant
(SPLVWPPKKRQNGPVIYKHSR) (fig. 2).

The most 3' splice variant in Hs-unc-53/3 was
discovered whilst cloning Hs-unc-53/3 and was
shown to be present uniquely in human heart cDNA
libraries.

Single nucleotide polymorphisms 

Cloning and PCR studies indicated the existence
of a non-silent single nucleotide polymorphism in
Hs-unc-53/1 at position 1232 and in Hs-unc-53/2 at
position 929. This indicates that variations exist in
human unc-53s which - in some cases - may be relevant
to the proper functioning of the UNC-53 protein and
hence to disease.

Expression in normal and neoplastic cells by RT-PCR

The cloning efforts demonstrated the existence of
splice variants in the human unc-53s and the Northern
blots revealed a range of transcripts for each human
unc-53. The combined data do not completely explain
the range of transcripts observed. Because our
understanding of the expression complexity of human
unc-53s may therefore be incomplete, more detailed
RT-PCR studies were performed.

One of the obscuring factors could have been that
all studies were performed on mRNA or cDNA of whole
tissues, which are built of different normal human cell
types that occur in different proportions. For this
reason, and because skin was not covered in the Northern
blot studies, an RT-PCR study was set up using cDNA
preparations of the different cells in normal human
skin: (1) epidermal keratinocytes, (2) melanocytes,
(3) dermal fibroblasts. In addition, lineage-matched
transformed cell lines or tumour cell lines were
included in the study to compare normal versus
neoplastic cells. Human umbilical vein endothelial
cells (HUVEC) were taken as a normal human match for
endothelial cell lines.

The RT-PCR study for Hs-unc-53/1 revealed that
the most 5'-splice variant is differentially expressed
in normal versus neoplastic cells/cell lines. This
exon is present in 7/7 keratinocytes, HUVEC and in
melanocytes but lacking in HaCaT, ECV304, 2/7 melanoma
and MCF-7 cells (breast carcinoma).

The RT-PCR study for Hs-unc-53/2 revealed a more
surprising picture. The tumourigenic endothelial line
ECV304 lacks expression of Hs-unc-53/2, whereas its
normal counterpart HUVEC expresses Hs-unc-53/2,
suggesting gene deletion or inactivation of expression
in ECV304. Epidermal keratinocytes, the lineage-matched
spontaneously transformed keratinocyte line HaCaT
and MCF-7 lack expression of the 5'-end of
Hs-unc-53/2, but express the 3' end (starting in or near
the microtubule-binding domain). This suggests that,
like AB023155 for Hs-unc-53/3, Hs-unc-53/2 can also be
expressed as a truncated 3'-variant in a cell-specific
way. Splice variation of Hs-unc-53/2 also appears to
differ between normal and neoplastic cells: the {VNE}
exon was shown to be present in all keratinocyte
isolates but not in HaCaT; melanocytes also express it,
but 2/7 melanoma and MCF-7 do not.

The RT-PCR studies for Hs-unc-53/3 were focussed on
demonstrating expression of AB023155 in tissues other
than brain. The new exon described was shown to be
present in keratinocytes, HUVEC, dermal fibroblasts,
melanocytes and their transformed/neoplastic variants,
demonstrating its wide expression in tissues in man.

Alternative 5'-start exons

For Hs-unc-53/2, five different start exons have
been cloned using RT-PCR, three of which have been
confirmed to be present in at least 2 different cDNA
libraries (figure 1b, c). Likewise, for Hs-unc-53/3
different 5'-exons were found, two of which were
confirmed (figure 1e, f). These 5'-exons most
probably indicate that human unc-53s are
expressed under the control of alternative promoters
that lie 5' of these different 5'-exons. In the
nematode, too, it has been shown that different
(intronic) promoters drive the expression of
5'-variants of C. elegans unc-53.

The Hs-unc-53/1 5'-end

Despite considerable efforts, cloning has not
led to the identification of a bona fide 5'-end for
Hs-unc-53/1 that comprises an F-actin binding domain,
even though the Northern blots indicate the
existence of transcripts larger than 9.5 kb. Given that
both Hs-unc-53/2 and -/3 are expressed as full-length
and truncated forms, the question can be raised whether
Hs-unc-53/1 may not be expressed in a short form as
well.

cDNA library cloning and 5'-RACE have provided
contiguous sequence that ends at a position that
matches a domain in C. elegans unc-53 where an
alternative start position lies. Based on this
argument, Hs-unc-53/1 could be a functional equivalent
in man of this transcript in the nematode.

To further trace the "longer" variants of
Hs-unc-53/1, genomic BAC DNA sequencing has been
performed. Figure 1g shows the sequence of a 4984 bp
fragment from BAC 585E09. It comprises sequence 5' of
the presently known cDNA of Hs-unc-53/1. By the
skilled person as well as by means of two groups of
gene structure prediction computer programs, different
but comparable exons in the 4984 bp genomic sequence
fragment can be predicted (figure 14). The programs
GENSCAN, HEXON and MZEF all predict an exon between
bp 1089 and bp 1880. The end of this predicted exon
(bp 1880) is confirmed by the cDNA sequence. Therefore
this prediction has a good chance of indicating the
correct exon length. The programs GRAIL, GENEFINDER and
HMMGENE all predict an exon between bp 1123 and
bp 2031. None of the predicted exons contains an
in-frame stop codon 5' of the alternative start codon.
Consequently, it is possible that there exist
unidentified exons 5' of the exon containing the
alternative start codon.

The present picture strongly suggests that both
nematode and human unc-53s are complex
transcriptional units. Moreover, the fact that some
of the most complex splice variants map to similar
regions in the UNC-53 proteins points to evolutionarily
conserved functional variants of UNC-53s, e.g. with
regard to the cell's directional migration towards or
away from a signal source. In contrast, some of the
variants in the human UNC-53s are located in highly
conserved domains; these (and other) variants may
create discrete - as yet undiscovered - functionally
different UNC-53 proteins transcribed from one of the
unc-53 genes.

The facts that two and maybe three human unc-53s
exist as full-size and truncated forms with
cell-specific expression, that series of alternative
5'-start exons exist, possibly controlled by different
promoters, and that some forms of splice variation are
conserved from nematode to man, all indicate that the
expression of unc-53s is of very high complexity and
that some of the biological functions of UNC-53
proteins are extremely conserved.

On the other hand, the differential expression in
Northern blots, the splice variation difference
between normal and lineage-matched neoplastic cells
and the non-silent single nucleotide changes in two of
the three human unc-53s all indicate how important a
wide range of diagnostic assays can be for understanding
in depth the role of human unc-53s in disease.

Chromosomal localization of Hs-unc-53/2 by
Genemap98 (Fig. 13 and 1(c))

The EST clones AA918601, AI248585, AA115014 and
AA115015 are clearly homologous to the 3'-UTR of
Hs-Unc-53/2 cDNA (Figure 1(c)). AA115014
(describing the same EST as AA115015), however,
contains an alternative splice variant of the
Hs-Unc53/2 gene in the 3'UTR. A survey with ESTs
AA918601, AI248585, AA115014 or AA115015 as query in
the genemap98 database (release November 1998) revealed
that the Hs-Unc53/2 gene is located on chromosome 11
(http://www.ncbi.nlm.nih.gov/genemap98/loc.cgi?ID=21224).
The STS which is used for chromosomal
localization and which is situated in the 3'UTR of the
Hs-Unc53/2 gene is referred to as SHGC-33456 (dbSTS
Id: 41891, Genbank Acc: G28036, Genbank gi: 1396755)
(Figure 13a). The STS was localized by analysis on
the NIGMS human/rodent somatic cell hybrid panel
(dbSTS Id: 41891). The radiation hybrid results are
summarized in Figure 13b. Together these data imply
that every disease or phenotype connected to
SHGC-33456 is due to the Hs-Unc-53/2 gene.

Functional Characterisation of Hs-unc-53/3

F-actin reorganisation and microtubule binding of
Hs-unc-53/3

Based on its structural features, Hs-unc-53/3 can
be classified as a bona fide human unc-53. To further
understand its function and in anticipation of
developing pharmacological compound screening assays,
Hs-unc-53/3 has been physically cloned following the
method described in the experimental section and shown
in figure 7a. The derived Hs-unc-53/3 clones
comprising the full length (A to L) and the 3'-half
(G to L) of Hs-unc-53/3 were further engineered to form
a chimera with green fluorescent protein and cloned
into expression vectors appropriate for transfection of
eukaryotic cells. The nucleic acid and amino acid
sequences of these constructs are shown in figure
7b-e. The constructs were transfected into mouse
neuroblastoma N4 cells and scored for their effects on
the F-actin cytoskeleton and for binding to
microtubules, functions known for nematode unc-53 and
human unc-53/1.

The N4 cells transfected with a GFP fusion to the
3'-half of Hs-unc-53/3 (pGI3303, fig. 7b) showed
pronounced filopodia and lamellipodia outgrowth, which
is associated with reorganization of the F-actin
cytoskeleton (Figure 8). This observation
demonstrates that, as for nematode unc-53 and human
unc-53/1, the F-actin binding domain is not required for
inducing reorganization of the F-actin cytoskeleton of
N4 cells. In addition, the pGI3303-encoded fusion
protein does not co-localize with microtubules but
localizes to the cytoplasm of N4 cells, indicating that
an important domain for microtubule association is
missing in this C-terminal fragment of Hs-unc-53/3.
The alignment in figure 2 shows that the C-terminal
half of Hs-unc-53/3 (approximately KIAA0938)
does not comprise the conserved microtubule binding
domain.

In contrast, the N4 cells that expressed low to
medium levels of the GFP fusion to full-length
Hs-unc-53/3 (pGI3305, Fig. 7d) displayed a
co-localization of the GFP fusion protein with
microtubules (Figure 9). Even the centrosomes could
clearly be detected in some transfected cells. Cells
expressing very low amounts of the fusion protein
displayed specific microtubule (+)-end binding
(Figure 9). The morphology of the pGI3305 transfected
N4 cells does not clearly differ from that of the
control transfected cells, although there is a
tendency towards rounding up of the pGI3305
transfected cells and towards filopodia outgrowth.

Validation of functional assays as compound screens

R74288 has previously been shown to be an
inhibitor of nematode function in C. elegans
(WO96/38555), an activity that has been confirmed in
Ce-unc-53 transfected N4 cells, where only the
transgene-induced effect was inhibited by R74288. In
order to confirm compound R74288's activity in a full
mammalian system, a stable transfection of plasmid
pGI3150 was performed in the N4 neuroblastoma cell
line with the Lipofectamine procedure (Gibco BRL).
pGI3150 expresses an eGFP protein in fusion with the
C-terminal end of Hs-unc-53/1 (see Figure 15a). After
two weeks of G418 selection, 20 clones with stable
integration of the pGI3150 plasmid were selected and
isolated. These clones were tested for GFP expression
by fluorescence microscopy and by Western blotting
with an anti-GFP antibody (table 1). The lamellipodia
outgrowth phenotype was checked visually (see Figure
15b). Compound R74288 was tested on four randomly
selected pGI3150 stably transfected clones (8.1, 8.2,
8.3 and 10.1) and on a pool of pEGFPC1 stably
transfected N4 control cells. Clones 8.2 and 10.1
displayed less lamellipodia outgrowth than clones 8.1
and 8.3. Compounds and solvents were added to the
stably transfected cells (10^-5 M in DMSO). After 24 hrs
of incubation, two persons independently scored the
effect of the treatments on the cells. As shown in
table 1, both persons noticed an effect of compound 2 on
clones 8.2 and 10.1, which have a weak transgene-induced
lamellipodia phenotype. This effect consisted of a
flatter morphology of the treated versus untreated
cells. Compound 2 was R74288.

Table 1. Effect of compounds on lamellipodia
formation



Clone     Compound 1  Compound 2  Compound 3  Compound 4  GFP fluo  GFP Western  Phenotype
--------  ----------  ----------  ----------  ----------  --------  -----------  ---------
8.1       0           0           0           toxic       +         +            +
8.2       0           +           0           toxic       ++        +++          +/-
8.3       0           0           0           toxic       ++        ++           ++
10.1      0           +           0           toxic       +/-       +            +/-
GFP pool  0           0           0           toxic

Automated compound screening by measuring cell
morphology

Compound screening assays must have a
sufficiently high throughput to be relevant to drug
discovery. To achieve this goal, we automated the
procedure of measuring the morphological changes
induced in cells following transient transfection with
full-length or 3'-half Hs-unc-53/3 GFP chimeras.
The cell culture, transfection, fluorescence staining
and microscopy procedures are performed within a
96-well plate (all-in-one). The fluorescent staining
method comprises a triple fluorescent labeling
procedure: (1) for cell nuclei, using DNA-binding
dyes such as Hoechst 33342 or DAPI, (2)
for transfection efficiency and expression level of
the chimeric protein, using GFP fluorescence, and (3)
for the F-actin cytoskeleton, using fluorescently
labeled phalloidin, a microfilament dye.

These three different fluorescent images are
collected using a motorised stage plus stage driver
and a frame grabber that produces seamless composite
images of the cells in the well. The software
programs to drive this operation are publicly
available as "SCIL" (University of Amsterdam). The
seamless images are then superimposed using
pseudocolour for the operator to inspect the quality
of the culture. In addition, the SCIL program was
compiled in such a way that it: (1) identifies cells
by means of their nucleus, (2) measures the GFP
fluorescence intensity, (3) delineates the area of the
F-actin (phalloidin) staining surrounding a nucleus and
(4) calculates a range of parameters objectively
representing the features of the F-actin staining
pattern of each individual cell. One example of such
a parameter is called the "form factor". It is an
arbitrary value that reflects the dendricity of a
cell. It is derived by calculating (A) the true
circumference of a cell's F-actin staining area as
seen in the image and (B) the area of the F-actin
staining of that given cell. The ratio
4 x PI x (B) / (A)^2 is the form factor. For a rounded
cell, the form factor approximates 1 whereas, for a
cell with increased filopodia and lamellipodia
outgrowth, the true circumference will be much larger
than that of a circle of the same area and, as a
result, the form factor is much smaller than 1.
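The form factor described above can be illustrated with a short calculation. The following is a minimal sketch, not the SCIL implementation: it assumes the usual definition 4 x PI x area / circumference^2, which equals 1 for a perfect circle, and the function name `form_factor` and the example values are hypothetical.

```python
import math


def form_factor(circumference, area):
    """Return 4 * pi * area / circumference**2.

    circumference: outline length (A) of the F-actin staining area
    area:          surface (B) of the F-actin staining area
    Both values must use consistent units (e.g. pixels and pixels^2).
    """
    if circumference <= 0 or area <= 0:
        raise ValueError("circumference and area must be positive")
    return 4.0 * math.pi * area / circumference ** 2


# Illustrative values only: a perfectly round cell of radius 10 ...
radius = 10.0
round_cell = form_factor(2 * math.pi * radius, math.pi * radius ** 2)

# ... and a cell of the same area whose outline is three times longer,
# as might be the case with extensive filopodia and lamellipodia.
dendritic_cell = form_factor(3 * 2 * math.pi * radius, math.pi * radius ** 2)

if __name__ == "__main__":
    print(f"round cell:     {round_cell:.3f}")
    print(f"dendritic cell: {dendritic_cell:.3f}")
```

Because the circumference enters as a square, tripling the outline length at constant area reduces the form factor ninefold, which is why the parameter is sensitive to outgrowth.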

In experiments it was shown that transiently
transfected N4 cell populations indeed displayed a
different form factor versus control cells. Both the
median and average form factor for a cell population
in a well were reduced following transfection with the
3'-half of Hs-unc-53/3. More particularly, there was
a significant decrease in the number of cells in a
transfected culture that displayed the minimal form
factor, suggesting that the Hs-UNC-53/3 transgene
induced round cells in particular to become more
dendritic (figure 16).



Chromosomal localisation of Hs-unc-53/3 by FISH
indicative of a role in disease

With FISH technology using a unique fragment of
Hs-unc-53/3 we were able to localize the Hs-unc-53/3
gene to chromosome 12q21.1. Chromosome 12q21.1 is a
region shown to be involved in autosomal dominant
cornea plana and closed angle glaucoma
(Sigler-Villanueva et al., Ophthalmic Genetics 18:55-62,
1997). This indicates that Hs-UNC-53/3 protein may be
involved in eye development and thus eye diseases,
such as retinoblastomas. Neuroblastoma cell line NPG
and liposarcoma line WDLPS and other sarcoma lines
have amplifications in this region. The neuroblastoma
amplification seems to be located more distally (12q24)
while the liposarcoma amplification is located at 12q21
(Van Royal et al., Cancer Genetics and Cytogenetics
82:151-4, 1995). Three loci related to Darier's disease,
an autosomal dominant genodermatosis
characterized by epidermal acantholysis and
dyskeratosis, have been mapped in region 12q21-q24
(Wright et al., Journal of Investigative Dermatology
103:665-8). 12q21 is also known to be a fragile site
associated with the pathogenesis of non-Hodgkin's
lymphoma (Chary-Reddy et al., Cancer Letter 86:111-7,
1994). Duplications related to nephroblastoma
tumorigenesis were commonly found in the 12q21-q23
region (Austruy et al., Genes Chromosomes Cancer
14:285-294, 1995). In a girl with mental retardation,
a convulsive disorder and clinical findings resembling
cerebral palsy, segments from other autosomes were
found positioned adjacent to the band 12q21
(Biederman et al., Ann Genet 19:257-260, 1976).
Cytogenetic analysis for myeloid leukemia showed a
complex karyotype with chromosomal breakpoints at
12q21 (Weinstein et al., Cancer Genet Cytogenet
48:75-81, 1990). Finally, analysis of complex
chromosomal rearrangements in malformed children and
from spontaneous abortions showed specific breakpoints
at site 12q21 (Gorski et al., Am J Med Genet 29:247-261,
1997). Most of these diseases have been shown to be
involved with cell movement, aberrant development, or
cell-cell contact and neuronal tissue or neuronal
development.

Confirmation of FISH with Radiation hybrid panels

To confirm and refine the chromosomal
localisation of the human unc-53s, an alternative
method to FISH has been used. Radiation hybrid (RH)
mapping is a somatic cell hybrid technique that was
developed to construct high-resolution, contiguous
maps of mammalian chromosomes. RH mapping provides a
method for ordering DNA markers spanning millions of
base pairs of DNA at a resolution not easily obtained
by other mapping methods. Some of the advantages of
RH mapping are that (1) distance estimated by this
method is directly proportional to physical distance,
(2) nonpolymorphic DNA markers, which cannot be used
for meiotic mapping, can be used for this method, and
(3) a high-resolution map that is not easily made by
other methods can be obtained.

The results of FISH and RH mapping for the three
human unc-53s are summarised in table 2. By using
publicly available databases (see experimental
section) one can derive information on the correlation
between FISH and RH mapping. RH mapping was shown in
this way to confirm the FISH data for the three
unc-53s.

Table 2. RH Mapping Primers and Results



Unc-53                  FOR primer                        REV primer                    PCR results                                 Marker*     FISH
----------------------  --------------------------------  ----------------------------  ------------------------------------------  ----------  -------
Hs-UNC-53/1 (BAC585E9)  5'TGTGGGTGAGGAATGCTGAC            5'CAGAGCTTGCTCTAGAGGAC        51, 62, 66                                  SHGC-30236  1q31-32
Hs-unc-53/1 (BAC585E9)  5'CCTGCCCAACATAGCAAGAC            5'CCATCTACAATGAGCCAGAC        51, 62, 66                                  SHGC-30236  1q31-32
Hs-unc-53/2, G411       5'CTGCCTCCCTTTGCTGTGTTGCATG       5'CTGAGCAGAGTGAAGCCAGAGTTGG   8, 28, 29, 43, 44, 51, 59, 66, 70, 77, 83   AFM022th2   11p15.1
Hs-unc-53/2, F4.1.2     5'TCATGTATTCCCCACAGACAAGCC        5'CATTGTGTCTTGATACTTTGGGGTGC  8, 28, 44, 51, 59, 65, 83                   SHGC-31021  11p15.1
Hs-unc-53/2, D4.1.1     5'GAGGATTTTATTTCTGGGAAATGGAATCGG  5'TGATCTTCCACTCCGTGGATAACT    8, 27, 28, 29, 43, 44, 51, 59, 65, 70, 83   AFM022th2   11p15.1
Hs-unc-53/2, J4.1.4     5'AAAGCCCAAGCCCCGGGAGAAGATG       5'AACCCGTTTTCCACCGAGCCGCTC    8, 27, 28, 43, 44, 51, 59, 66, 70, 83       AFM022th2   11p15.1
Hs-unc-53/3, A215       5'ACTTGCTGAAACAGAGAGCTCCATG       5'CTTGCTGTCTTCTTTCTCCTTGGC    1, 48, 50, 51, 59, 65, 66, 73, 74, 76, 78   SHGC-17536  12q21.1
Hs-unc-53/3, A211       5'TGATCTTCTAGCGTGTGACTCACTG       5'ATCATTCCTTGGAGT             1, 48, 50, 51, 59, 73, 76, 78               SHGC-17536  12q21.1

(*) list not exhaustive

Sequence information available in the public
domain can also help refine the positioning of the
unc-53 genes, as in the following example. The EST
clones AA918601, AI248585, AA115014 and AA115015 are
clearly homologous to Hs-Unc53/2 cDNA. AA115014
(describing the same EST as AA115015), however,
contains an alternative splice variant of the
Hs-Unc53/2 gene in the 3'UTR. A survey with ESTs
AA918601, AI248585, AA115014 or AA115015 as query in
the genemap98 database (release November 1998) revealed
that the Hu-unc53/2 gene is located on chromosome 11
(http://www.ncbi.nlm.nih.gov/genemap98/loc.cgi?ID=21224).
The STS which is used for chromosomal
localization and which is situated in the 3'UTR of the
Hs-Unc53/2 gene is referred to as SHGC-33456 (dbSTS
id: 41891, Genbank Acc: G28036, Genbank gi: 1396755)
(Figure 13). The STS was localized by analysis on the
NIGMS human/rodent somatic cell hybrid panel (dbSTS
id: 41891). The radiation hybrid results are
summarized in Figure 13. Together these data imply
that diseases or phenotypes connected to SHGC-33456 are
due to the Hs-Unc53/2 gene.






EXPERIMENTAL PROCEDURES

Cloning & sequencing of Hs-unc-53/3

Hs-unc-53/3 has been cloned starting from a series
of ESTs that were similar but not identical to
Hs-unc-53/1 or -/2. The ESTs were:

1. WashU-Merck EST 767735.

Transformed cells carrying the EST 767735
sequence were ordered from Research Genetics. Plasmid
DNA was isolated using standard protocols (Qiagen
plasmid DNA isolation kit) and the sequence of the
insert was determined.

2. ATCC cDNA clone 86459.

Transformed cells carrying the cDNA clone
86459 sequence were ordered from ATCC. Plasmid DNA
was isolated using standard protocols (Qiagen plasmid
DNA isolation kit) and the sequence of the insert was
determined.

3. Genethon cDNA clone c09a03 from the
Geneexpress cDNA program.

Transformed cells carrying the cDNA clone
c09a03 sequence were ordered from Genethon. Plasmid
DNA was isolated using standard protocols (Qiagen
plasmid DNA isolation kit) and the sequence of the
insert was determined.

These ESTs were extended to form one ORF as
follows:

1. 5' extension of EST 767735 by RACE (Rapid
Amplification of cDNA Ends).

Marathon-Ready cDNAs (Clontech) are premade
"libraries" of adaptor-ligated double-stranded cDNA
ready for use as templates in RACE experiments. Five
µl Marathon-Ready cDNA was used as template in a
regular 50 µl RACE. The RACE mixture contained 1 x
KlenTaq PCR buffer, 0.2 mM of each dNTP, 1 x
Advantage KlenTaq polymerase mix (Clontech), 0.15 µM
AP1 adaptor primer and 0.15 µM RACE gene-specific
primer. The amplification conditions were as follows:
94°C for 30 s and 68°C for 4 min. One-hundred-fold
diluted RACE product was used as a template in a
nested PCR with AP2 adaptor and gene-specific nested
PCR primers. Specific nested PCR fragments were
cloned into pCR2 (TA cloning kit, Invitrogen) and the
sequences of the inserts were determined.
Gene-specific primer (hh3UNC53 97101702):
5'ACCATTTACACCTGAAGACGATTGAGGTCC; nested gene-specific
primer (hh3UNC53 97101701):
5'CTCCTATTTAAATTAGAGGCTCCCTGGACC. Marathon cDNA
libraries: human placenta, human heart, human chronic
myelogenous leukemia, human colorectal adenocarcinoma.

2. 3' extension of EST 767735 by RACE.

Method as described previously. Gene-specific
primer (hh3UNC53 97102702):
5'CAATCGTCTTCAGGTGTAAATGGTAACGTG; nested gene-specific
primer (hh3UNC53 97102703):
5'GAATGTCAAACACAGTGCCACCTCCACC. Marathon cDNA
libraries: human placenta, human heart, human HeLa,
human melanoma.

3. 3' extension of cDNA clone c09a03 by RACE.

Method as described previously. Gene-specific
primer (hh3UNC53 98020401):
5'AGGGAGCACTGAATGGTCCAGACCATCCTC; nested gene-specific
primer (hh3UNC53 98020402):
5'GCATCAGAAGACAGCATTCCTCTGAAAGTG. Marathon cDNA
libraries: human placenta, human heart, human HeLa,
human melanoma, human colorectal adenocarcinoma, human
chronic myelogenous leukemia.

4. 5' extension of cDNA clone 86459 by RACE (1).

Method as described previously. Gene-specific
primer (hh3UNC53 98020403):
5'TTCAATTTCTATCTCTATGAGTTTTCTTCG; nested gene-specific
primer (hh3UNC53 98020404):
5'GCAGCTCTAGATTTGGTGATGAAGAAACTC. Marathon cDNA
libraries: human placenta, human heart, human HeLa,
human melanoma. Overlapping sequences were assembled
in a single contiguous sequence.

5. 5' extension of cDNA clone 86459 by RACE (2).

Method as described previously. Gene-specific
primer (hh3UNC53 98022502):
5'TCAGAATGTGATGAAGGAGGCTTGGTGGAC; nested gene-specific
primer (hh3UNC53 98022501):
5'GGATGCCGGAAGGGATGAATCAGTAAGC. Marathon cDNA
libraries: human placenta, human heart, human HeLa,
human melanoma, human colorectal adenocarcinoma, human
chronic myelogenous leukemia.

Validating variants at 5' end of the cDNA
sequence

In the final 5' RACE experiment, 2 variants were
found whose sequences diverge upstream from the
IYTDWAN protein sequence (position 289 in figure 1e or
position 82 in figure 1f). By using primers
ATTTACACTGACTGGGCCAAC and ATAATCTGGATGATTTCTGCTAGGAGT
on cDNA clones, a Hs-unc-53/3 specific PCR product was
obtained that was radiolabeled using the random primed
DNA labeling kit (Roche Molecular Biochemicals) and
hybridized to human DNA BAC filters (Research
Genetics). Both primers are located near the IYTDWAN
box. Four BACs turned out positive (415J11, 464C17,
525C02 and 537B02). DNA sequencing of the region
upstream from the IYTDWAN protein sequence directly on
these BACs showed that this region was preceded by a
putative intronic sequence, as evidenced by the
multiple stop codons in the reading frame and by the
consensus AG intron acceptor sequence. For sequencing
purposes, BAC DNA was prepared according to a modified
Qiagen plasmid DNA procedure.
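That the first primer lies on the IYTDWAN box can be checked by translating it in its first reading frame. The sketch below is only an illustration under stated assumptions: it uses the standard genetic code, the helper name `translate` is hypothetical, and spaces introduced by text extraction are simply ignored.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 1; trailing partial codons are dropped."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    forward_primer = "ATTTACACTGACTGGGCCAAC"
    print(translate(forward_primer))
```

The 21-nucleotide primer corresponds to exactly seven codons, one for each residue of the box.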

A primer pair was designed specifically to
amplify the 5' end of the variant shown in full in
figure 1e (primers ACTTGCTGAAACAGAGAGCTCCATG and
CTTGCTGTCTTCTTTCTCCTTGGC). PCR with these primers on
BAC DNA showed the presence of the genomic sequence
encoding this variant in 3 out of the 4 BACs (not
present in BAC 415J11).

BACs containing the genomic sequence encoding the
other 5' end variant of Hs-unc-53/3, shown as the
variant in figure 1e, were identified by hybridizing
the Research Genetics human DNA BAC filters with
primer TGATCTTCTAGCGTGTGACTCACTG, radioactively
labeled using gamma-32P-ATP and polynucleotide kinase.
Positive BACs were 404F14, 450K18 and 764L15.

Sequencing directly on the respective BACs in the 3'
direction from within the 2 alternative 5' exons and
comparison of the genomic DNA sequence with the
previously determined cDNA sequence identified the GT
intron donor site. Joining of the genomic sequences
from both 5' exons and the IYTDWAN encoding sequence
after removal of the predicted intronic sequence
restored for both variants the sequence of the 5' RACE
experiment without affecting the translation of the
Open Reading Frame.

Cloning of Hs-unc-53/3 constructs

With the aim of cloning the full-length Open
Reading Frame of Hs-unc-53/3, primer pairs were
selected such that the ORF could be amplified in 6
overlapping fragments ranging in size from 1 to 2 kbp.
Overlaps between the fragments were chosen such that
they contain an endonuclease restriction enzyme
recognition site suitable for cloning the full-length
gene. For the 5' fragment, the downstream oriented
primer was chosen to contain the first putative start
codon (ATG) in variant 1 (the one shown in full in
figure 1e). PCR conditions using the Expand High
Fidelity PCR system (Roche Molecular Biochemicals) for
all of the fragments were as follows: initial
denaturation for 5 min at 95°C; 30 cycles of
denaturation at 95°C for 45 s, primer annealing at 55°C
for 45 s and extension at 72°C for 1 min (3 min for
primer combination A+B); followed by an additional
incubation for 7 min at 72°C and storage at 4°C. PCRs
were run on PE Biosystems 9700 PCR machines.
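As a rough planning aid, the programmed hold time of the cycling protocol above can be summed as follows. This is a sketch only: it ignores temperature ramping and the final 4°C storage step, and the function name `program_minutes` is hypothetical.

```python
def program_minutes(extension_min, cycles=30):
    """Sum the programmed hold times (in minutes) of the PCR protocol.

    Ramp times between temperatures and the final 4 degC hold are ignored.
    """
    initial_denaturation = 5.0       # 5 min at 95 degC
    denaturation = 45.0 / 60.0       # 45 s at 95 degC
    annealing = 45.0 / 60.0          # 45 s at 55 degC
    final_extension = 7.0            # 7 min at 72 degC
    per_cycle = denaturation + annealing + extension_min
    return initial_denaturation + cycles * per_cycle + final_extension


standard_fragments = program_minutes(extension_min=1.0)  # all pairs except A+B
fragment_a_b = program_minutes(extension_min=3.0)        # primer combination A+B

if __name__ == "__main__":
    print(f"standard fragments: {standard_fragments:.0f} min")
    print(f"fragment A+B:       {fragment_a_b:.0f} min")
```

The longer extension for primer combination A+B adds one hour of hold time over 30 cycles, so that fragment would typically be run as a separate program.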

Primer pairs used for cloning Hs-unc-53/3 fragments

#    Size (bp)  Primer  Sequence
---  ---------  ------  ---------------------------------
A-B  2229       A       TCAGCTCGAGCATATGCCTGTTCTTGGGGTTGC
                B       GGGGTGGGTCGACTTGTCAAGTGG
C-D  847        C       ATGGAAGGACCATACCCAACTTGAC
                D       CTTGTTCCAGCTTTCTGCCTAGATG
E-F  781        E       CAGGTTCCTGGAGAAGAGGCATGTC
                F       GGTGAGGCAATATCTGGATACTTGG
G-H  1291       G       AGGCAGCCAGGATCCAAGTATCCAG
                H       TGCGAAGATCTTTTGGGAGGATGGTC
I-J  1022       I       AACCATTGAAATGCTGAAGGCTCAG
                J       GGTTATGGGATCTAATTAAGTCTCC
K-L  1255       K       CACTAGCCTTGGTCTGAGCTCTGAC
                L       TCACCCTCTAGAGGGTAGATTCAAG



Primer A contains restriction sites (XhoI and
NdeI) suitable for final subcloning in a eukaryotic
expression vector (pEGFPc3) and in a yeast-two-hybrid
vector (pAS2-1), respectively.

PCR products were analyzed by agarose gel
electrophoresis and were visualized by ethidium
bromide staining. Splice variants as mentioned in
figure 1e were observed as multiple bands on agarose
gels. Single band PCR products were purified with the
Qiaquick PCR purification kit, whereas multiple band
PCR products were cut out from gel as individual bands
and purified using the Qiaquick gel extraction kit.
PCR products were cloned in pCR2.1 according to the
supplier's protocol (Invitrogen). For each fragment,
multiple clones were picked from selective LB agar
plates and grown overnight under antibiotic selection
pressure for DNA preparation either on the BioRobot
9600 (Qiagen) or manually on anion exchange columns
(Qiagen tip 20 or tip 100). Insert sequences were
determined using the BigDye terminator ready reaction
cycle sequencing kit (PE Biosystems). Individual
sequencing reactions for each clone were assembled in
single sequence contigs using the Sequencher software
package (GeneCodes). Sequences were compared to the
previously determined consensus sequence using the
SeqEd software package from PE Biosystems. For each
fragment a clone was selected containing the correct
sequence and the splice variant of interest. For the
I-J fragment, a clone was selected that lacked the
heart-specific 22 amino acid splice variant (figure
1f). In the K-L fragment clone, a SfiI-SacII linker
was cloned in the BamHI site of the pCR2.1 multiple
cloning site to facilitate subcloning of the
full-length gene into the yeast-two-hybrid vector
(pAS2-1) and the eukaryotic expression vector
(pEGFPc3), respectively.

The overall cloning strategy of the full-length
gene is visualized in figure 7a. 7a1 illustrates the
overlapping PCR fragments and the nomenclature of
fragments and primer pairs. 7a2 illustrates the
assembly of the 3' half of the gene in pCR2.1.
Internal BamHI (I-J fragment) and XhoI (K-L fragment)
sites as well as restriction sites from the multiple
cloning site of pCR2.1 (as shown in the figure) were
removed by site-directed mutagenesis (SDM) using the
QuikChange Site-Directed Mutagenesis kit
(Stratagene). The NotI-EcoRI G-H fragment and the
EcoRI-NheI I-Jd22 fragment (d22 indicating that the
22 amino acid splice variant is absent) were
directionally cloned in the NotI and NheI sites of the
K-L fragment clone. Multiple clones were picked and
verified by DNA sequencing. 7a3 illustrates the
assembly of the 5' half. Internal XhoI (C-D fragment)
and SfiI and XhoI (E-F fragment) sites were removed by
SDM.




WO 99/63080 PCT/EP99/03848 


Inserts were cut out from the vectors by restriction
digestion with the appropriate restriction enzymes
(XhoI+SalI; SalI+NarI and NarI+BamHI, respectively)
and purified from gel after agarose gel
electrophoresis. The 3 fragments were ligated
together, re-cut with XhoI and BamHI and separated on
gel. The band of the expected size was cut out of the
gel, purified and cloned in front of the 3' half,
opened by digestion with XhoI and BamHI (figure 7a4).
Multiple clones were picked and verified by
sequencing.

Figure 7a illustrates the modular nature of the
cloning project. For all the possible combinations of
splice variation within the building block fragments,
one representative clone is available. For functional
analysis, building blocks can be exchanged easily by
standard technology, either in the pCR2.1 construct or
in the final eukaryotic expression or yeast-two-hybrid
construct.

Construction of Hs-unc-53/3 GFP chimeras

The construction of the mammalian expression
vectors pGI3303 and pGI3305 is explained in the
legends of figures 7a, 7b and 7d. pGI3303 can be used
to overexpress in mammalian cells or animals a fusion
protein between eGFP and the 1128 AA C-terminal
fragment of Hs-unc-53/3 (fig. 7c). pGI3305 can be used
to overexpress in mammalian cells or animals a fusion
protein between eGFP and the 2363 AA full-length
Hs-unc-53/3 (fig. 7d). The Hs-unc-53/3 cDNA in pGI3303
as well as in pGI3305 contains silent mutations that
introduce or remove specific restriction sites in
order to be able to easily subclone different types of
alternative splice variants in these vectors.

Genomic DNA sequencing (BAC 585E09)

Using the primers AGGACCCTATGCGGAGGTCAAGCCGC and
TGGGTTGGCATCATCGCTGTCGTAGC, a PCR specific for
Hs-unc-53/1 was developed. PCR products were
radiolabeled using the Random Prime DNA labeling kit
(Roche Molecular Biochemicals) and hybridized on the
human genomic DNA BAC filters (Research Genetics).
Positive signals were obtained for BAC clones 366H21,
483L14, 471J09 and 585E09. BAC DNA was isolated from
E. coli genomic clone 585E09 according to a modified
Qiagen plasmid DNA preparation procedure. A shotgun
library of 1920 clones was constructed at GATC
(Konstanz, Germany). BAC DNA was prepared, nebulized
and subcloned after end-repairing in the sequencing
vector pTZ19R. At JRF, DNA was prepared on the
Biorobot 9600 (Qiagen) from 1440 clones. End
sequencing reactions with M13 forward
(TGTAAAACGACGGCCAGT) and reverse (CAGGAAACAGCTATGACC)
primers were done on 768 clones. 672 additional clones
were sequenced with M13 only. 5 µl DNA was used in a
15 µl final reaction volume using the BigDye
Terminator Ready Reaction sequencing kit. Sequencing
reactions were run on MJ Research PTC200 PCR machines.
Reaction products were run and analysed on PE ABI 377
DNA sequencers. All sequencing results were imported
in the Sequencher (GeneCodes) software package.
Contaminating vector sequences and trailing sequences
of low quality were trimmed. Individual sequences were
assembled in contigs with standard software settings.
A great number of contigs were constructed, ranging
from below 500 bp to over 10 kbp. Singletons are also
still present. By looking for strings of known
sequence, a contig was found containing the known and
reliable 5' end of hUNC53h1 and extending this
sequence in the 5' direction. This sequence and its
relevant features are described in figure 1g and its
legend.
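
The step of looking for strings of known sequence among the assembled contigs can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the procedure actually run in Sequencher: the function names and the three toy contigs are hypothetical, and only exact matches on either strand are reported. The probe is one of the Hs-unc-53/1 primers quoted above.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_known_sequence(contigs, probe):
    """Return (contig name, strand, 0-based position) for exact hits.

    contigs: dict mapping contig name -> sequence string.
    The probe is searched as given ("+") and as its reverse
    complement ("-"), since shotgun contigs have no fixed orientation.
    """
    probe = probe.upper()
    probe_rc = reverse_complement(probe)
    hits = []
    for name, seq in contigs.items():
        seq = seq.upper()
        pos = seq.find(probe)
        if pos != -1:
            hits.append((name, "+", pos))
        pos = seq.find(probe_rc)
        if pos != -1:
            hits.append((name, "-", pos))
    return hits


if __name__ == "__main__":
    probe = "AGGACCCTATGCGGAGGTCAAGCCGC"
    # Toy contigs, made up for the example.
    contigs = {
        "contig_1": "TTTT" + probe + "GGGG",
        "contig_2": "CC" + reverse_complement(probe) + "AA",
        "contig_3": "ACGTACGTACGT",
    }
    print(find_known_sequence(contigs, probe))
```

Real assemblies contain sequencing errors, so an alignment tool that tolerates mismatches would normally be preferred over exact matching.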

Northern blotting

A human multiple tissue Northern blot (MTN-1,
Clontech) containing in each lane 2 µg of poly A+ RNA
from eight different human tissues (heart, brain,
placenta, lung, liver, skeletal muscle, kidney, and
pancreas) and an MTN-II human multiple tissue Northern
blot, containing in each lane 2 µg of poly A+ RNA from
spleen, thymus, prostate, testis, ovary, small
intestine, colon and peripheral leukocyte, were
hybridized according to the manufacturer's
instructions and washed in 0.1x SSC/0.2% SDS at
55°C. Also from Clontech, a poly A+ RNA blot from
human cancer cell lines (melanoma G361, lung carcinoma
A549, colorectal adenocarcinoma SW480, Burkitt's
lymphoma Raji, lymphoblastic leukemia MOLT-4, chronic
myelogenous leukemia K-562, HeLa S3 and promyelocytic
leukemia HL-60) was tested.

Cancer cell line RNA blots probed with Hs-unc-53/3

A set of cancer cell line Northern blots was
probed with a 665 bp fragment of Hs-unc-53/3 amplified
using the primers 5' AGGAATTAAAATTAACGGATATTCGG and
5' AAAACTGTCCAAACTATTTTCTTCTACC. Hs-unc-53/3 is
expressed in melanoma G361 and lung carcinoma A549;
transcript sizes of >0.5 kb were detected. No
expression was detected in promyelocytic leukemia
HL-60, HeLa cell S3, chronic myelogenous leukemia
K-562, leukemia MOLT-4, Burkitt's lymphoma Raji and
colorectal adenocarcinoma SW480.

Normal human tissue RNA blots probed with Hs-unc-53/3

A set of normal human tissue Northern blots was
probed with a 665 bp fragment of Hs-unc-53/3 amplified
using the primers 5' AGGAATTAAAATTAACGGATATTCGG and
5' AAAACTGTCCAAACTATTTTCTTCTACC. High expression
levels were detected in brain, spleen, ovary and
spinal cord, lower levels in heart, placenta, testis,
stomach, and adrenal gland. Transcript sizes were >=
9.5 kb.

FISH

Hs-UNC-53/3 is localised to chromosome 12q21.1

Slides preparation:

Lymphocytes isolated from human blood were
cultured in α-minimal essential medium (MEM)
supplemented with 10% foetal calf serum and
phytohaemagglutinin (PHA) at 37°C for 68-72 hr. The
lymphocyte cultures were treated with BrdU (0.18 mg/ml,
Sigma) to synchronise the cell population. The
synchronised cells were washed three times with
serum-free medium to release the block and recultured
at 37°C for 6 hr in α-MEM with thymidine (2.5 µg/ml,
Sigma). Cells were harvested and slides were made
using standard procedures including hypotonic
treatment, fixation and air-drying.

In situ hybridisation and FISH detection:

A cDNA probe was biotinylated with dATP using the
BRL BioNick labelling kit (15°C, 1 hr) (Heng et al.,
1992). The procedure for FISH detection was performed
according to Heng et al., 1992 and Heng and Tsui, 1993.
Heng et al.: Proc Natl Acad Sci USA 89: 9509-9513
(1992). Heng et al.: Chromosoma 102: 325-332 (1993).
Briefly, slides were baked at 55°C for 1 hour. After
RNase treatment, the slides were denatured in 70%
formamide in 2x SSC for 2 min at 70°C, followed by
dehydration with ethanol. Probes were denatured at
75°C for 5 min in a hybridisation mix consisting of
50% formamide and 10% dextran sulphate. Probes were
loaded on the denatured chromosomal slides. After
overnight hybridisation, slides were washed and the
signals were detected and amplified. FISH signals and
the DAPI banding pattern were recorded separately by
taking photographs, and the assignment of the FISH
mapping data to chromosomal bands was achieved by
superimposing FISH signals on DAPI-banded
chromosomes (Heng et al., 1993).

Results

Under the conditions used, the hybridisation
efficiency was approximately 67% for this probe (among
100 checked mitotic figures, 67 showed signals on one
pair of the chromosomes). Since DAPI banding was used
to identify the specific chromosome, the assignment of
the signal from the probe to the long arm of
chromosome 12 was obtained. The detailed position was
further determined in the diagram based on the summary
from 10 photos.

Radiation Hybrid Mapping

Radiation hybrid analysis is a PCR technique and
the panels of radiation hybrid DNA are provided at a
concentration of 25 ng/µl in TE buffer suitable for
these reactions. Typically, 25 ng of DNA is used in a
10 µl PCR reaction.

Some of the radiation hybrid panels are supported
by an e-mail server which can assist in the
chromosome localization of markers. A server for the
chromosome localization of markers using the Stanford
G3 and Stanford TNG panels is available at
http://www-shgc.stanford.edu. At the time of catalog
publication, the Stanford TNG server was capable of
chromosome localization only on chromosomes 2, 4, 7
and 21. Chromosome localization of markers from the
GeneBridge4 panel may be performed by accessing the
server at http://www-genome.wi.mit.edu. RH mapping
involves the statistical analysis of several to many
markers to determine the relative order of the markers
with respect to one another. RH mapping can be
achieved using statistical programs that will provide
the best map along with a measure of the relative
likelihood of one order versus another.

This type of analysis has been shown to
successfully generate an order of markers on the RH
map that is significantly more likely than any
alternative order. Two statistical programs for RH
mapping can be downloaded from the World Wide Web free
of charge. SAMapper was produced at the Stanford
Human Genome Center and can be downloaded at
http://www-shgc.stanford.edu/Mapping/SAMapper/index.html.
RHMAP was written by Michael Boehnke at the University
of Michigan and can be downloaded at
http://www.sph.umich.edu/group/statgen/software. A
comprehensive web page regarding radiation hybrid
mapping, with links to web sites with analysis
software and other information, can be found at
http://linkage.rockefeller.edu/tara/rhmap/
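
As a rough illustration of how candidate marker orders can be compared, the Python sketch below counts obligate breaks, i.e. hybrids in which one of two adjacent markers is retained and the other is not, and picks the order with the fewest. This is a deliberately simple criterion and is not claimed to be the method implemented in SAMapper or RHMAP, which rely on more elaborate statistics. The marker names and retention vectors are made up for the example ("1" retained, "0" not retained, "2" not typed).

```python
from itertools import permutations


def obligate_breaks(order, vectors):
    """Count retained/not-retained transitions between adjacent markers."""
    total = 0
    for left, right in zip(order, order[1:]):
        for a, b in zip(vectors[left], vectors[right]):
            # Skip hybrids that were not typed for either marker.
            if a in "01" and b in "01" and a != b:
                total += 1
    return total


def best_order(vectors):
    """Exhaustively search all orders; only feasible for few markers."""
    best = min(
        permutations(sorted(vectors)),
        key=lambda order: obligate_breaks(order, vectors),
    )
    return best, obligate_breaks(best, vectors)


if __name__ == "__main__":
    # Hypothetical retention patterns across ten hybrids.
    vectors = {
        "M1": "1100110010",
        "M2": "1100100010",
        "M3": "0110100011",
    }
    print(best_order(vectors))
```

An order and its mirror image always score the same, so the search reports only the first of the two it encounters.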

Transfection protocol for cells

N4 neuroblastoma lines were seeded in Lab Tek
chambered coverglass (Nalgene Nunc International) and
transfected with pEGFP (control), pGI3303 and pGI3305
using Lipofectamine (Life Technologies BRL). After
24-48 hours, the chambered coverglasses were placed on
an inverted fluorescence microscope where GFP
fluorescence could be visualized in living cells. The
details of this method have been described in
PCT/EP96/02311.

Microscopy and fluorescence staining using
phalloidin have been described earlier (EP97/06956).

SEQUENCE LISTING

Seq ID No. 1 is a nucleic acid sequence of Hs-unc-53/1
and lacking the nucleotides from position 2873 to 3043
shown in Fig. 1a.

Seq ID No. 2 is a nucleic acid sequence of Hs-unc-53/1
and lacking the nucleotides from position 3098 to 3121
shown in Fig. 1a.

Seq ID No. 3 is a nucleic acid sequence of Hs-unc-53/1
and lacking the nucleotides from position 3518 to 3526
of the sequence identified in Fig. 1a.

Seq ID No. 4 is an amino acid sequence of Hs-unc-53/1
protein and lacking the amino acids from position 958
to 1014 of the sequence identified in Fig. 1b.

Seq ID No. 5 is an amino acid sequence of Hs-unc-53/1
protein and lacking the amino acids from position 1033
to 1040 of the sequence identified in Fig. 1b.

Seq ID No. 6 is an amino acid sequence of Hs-unc-53/1
protein and lacking the amino acids from position 1173
to 1175 of the sequence identified in Fig. 1b.

Seq ID No. 7 is a nucleotide sequence encoding
Hs-unc-53/2 and lacking the nucleotides from position
5425 to 5433 of the sequence illustrated in Fig. 1c.

Seq ID No. 8 is a nucleotide sequence encoding
Hs-unc-53/2 and lacking the nucleotides from position
5924 to 6024 of the sequence illustrated in Fig. 1c.

Seq ID No. 9 is a nucleotide sequence encoding
Hs-unc-53/2 and having the sequence of variant 1
illustrated in Fig. 1c.

Seq ID No. 10 is a nucleotide sequence encoding
Hs-unc-53/2 and having the sequence of variant 2
illustrated in Fig. 1c.

Seq ID No. 11 is a nucleotide sequence encoding
Hs-unc-53/2 and having the sequence of variant 3
illustrated in Fig. 1c.

Seq ID No. 12 is a nucleotide sequence encoding
Hs-unc-53/2 and having the sequence of variant 1
illustrated in Fig. 1c and lacking the nucleotides
from position 5425 to 5433 of the sequence illustrated
in Fig. 1c.

Seq ID No. 13 is a nucleotide sequence encoding
Hs-unc-53/2 and having the sequence of variant 1
illustrated in Fig. 1c and lacking the nucleotides
from position 5924 to 6024 of the sequence illustrated
in Fig. 1c.

Seq ID No. 14 is a nucleotide sequence encoding
Hs-unc-53/2 and having the sequence of variant 2
illustrated in Fig. 1c and lacking the nucleotides
from position 5425 to 5433 of the sequence illustrated
in Fig. 1c.

Seq ID No. 15 is a nucleotide sequence encoding
Hs-unc-53/2 and having the sequence of variant 2
illustrated in Fig. 1c and lacking the nucleotides
from position 5924 to 6024 of the sequence illustrated
in Fig. 1c.

Seq ID No. 16 is a nucleotide sequence encoding
Hs-unc-53/2 and having the sequence of variant 3
illustrated in Fig. 1c and lacking the nucleotides
from position 5425 to 5433 of the sequence illustrated
in Fig. 1c.

Seq ID No. 17 is a nucleotide sequence encoding
Hs-unc-53/2 and having the sequence of variant 3
illustrated in Fig. 1c and lacking the nucleotides
from position 5924 to 6024 of the sequence illustrated
in Fig. 1c.

Seq ID No. 18 is an amino acid sequence of
Hs-unc-53/2 protein and lacking the amino acids from
position 1776 to 1778 of the sequence identified in
Fig. 1d.

Seq ID No. 19 is an amino acid sequence of variant 1
of Hs-unc-53/2 sequence illustrated in Fig. 1d.

Seq ID No. 20 is an amino acid sequence of variant 2
of Hs-unc-53/2 sequence illustrated in Fig. 1d.

Seq ID No. 21 is an amino acid sequence of variant 3
of Hs-unc-53/2 sequence illustrated in Fig. 1d.

Seq ID No. 22 is an amino acid sequence of variant 1
of Hs-unc-53/2 sequence illustrated in Fig. 1d and
lacking the amino acids from position 1776 to 1778 of
the sequence identified in Fig. 1d.

Seq ID No. 23 is an amino acid sequence of variant 2
of Hs-unc-53/2 sequence illustrated in Fig. 1d and
lacking the amino acids from position 1776 to 1778 of
the sequence identified in Fig. 1d.

Seq ID No. 24 is an amino acid sequence of variant 3
of Hs-unc-53/2 sequence illustrated in Fig. 1d and
lacking the amino acids from position 1776 to 1778 of
the sequence identified in Fig. 1d.

Seq ID No. 25 is a nucleotide sequence encoding
Hs-unc-53/3 as illustrated in Figure 1e.

Seq ID No. 26 is a nucleotide sequence encoding
Hs-unc-53/3 as illustrated in Figure 1e and lacking the
nucleotides from position 3795 to 4283 of the sequence
identified therein.

Seq ID No. 27 is a nucleotide sequence encoding
Hs-unc-53/3 as illustrated in Figure 1e and lacking the
nucleotides from position 4284 to 4325 of the sequence
identified therein.

Seq ID No. 28 is a nucleotide sequence encoding
Hs-unc-53/3 as illustrated in Figure 1e and lacking the
nucleotides from position 3795 to 4325 of the sequence
identified therein.

Seq ID No. 29 is a nucleotide sequence encoding
Hs-unc-53/3 as illustrated in Figure 1e and lacking the
nucleotides from position 5153 to 5173 of the sequence
identified.

Seq ID No. 30 is a nucleotide sequence encoding
Hs-unc-53/3 as illustrated in Figure 1e and lacking the
nucleotides from position 5343 to 5408 of the sequence
identified.



Seq ID No. 31 is a nucleotide sequence encoding
Hs-unc-53/3 having the sequence of variant 1
illustrated in Fig. 1e.

Seq ID No. 32 is a nucleotide sequence encoding
Hs-unc-53/3 having the sequence of variant 1
illustrated in Fig. 1e and lacking the nucleotides from
position 3795 to 4283 of the sequence identified
therein.

Seq ID No. 33 is a nucleotide sequence encoding
Hs-unc-53/3 having the sequence of variant 1
illustrated in Fig. 1e and lacking the nucleotides from
position 4284 to 4325 of the sequence identified
therein.

Seq ID No. 34 is a nucleotide sequence encoding
Hs-unc-53/3 having the sequence of variant 1
illustrated in Fig. 1e and lacking the nucleotides from
position 3795 to 4325 of the sequence identified
therein.

Seq ID No. 35 is a nucleotide sequence encoding
Hs-unc-53/3 having the sequence of variant 1
illustrated in Fig. 1e and lacking the nucleotides from
position 5153 to 5173 of the sequence identified
therein.

Seq ID No. 36 is a nucleotide sequence encoding
Hs-unc-53/3 having the sequence of variant 1
illustrated in Fig. 1e and lacking the nucleotides from
position 5343 to 5408 of the sequence identified
therein.

Seq ID No. 37 is an amino acid sequence of
Hs-unc-53/3 protein as identified in the sequence of
Fig. 1f.

Seq ID No. 38 is an amino acid sequence of
Hs-unc-53/3 protein as identified in the sequence of
Fig. 1f and lacking the amino acid residues from
position 1326 to 1413 of the sequence identified
therein.

Seq ID No. 39 is an amino acid sequence of
Hs-unc-53/3 protein as identified in the sequence of
Fig. 1f and lacking the amino acid residues from
position 1414 to 1427 of the sequence identified
therein.

Seq ID No. 40 is an amino acid sequence of
Hs-unc-53/3 protein as identified in the sequence of
Fig. 1f and lacking the amino acid residues from
position 1703 to 1709 of the sequence identified
therein.

Seq ID No. 41 is an amino acid sequence of
Hs-unc-53/3 protein as identified in the sequence of
Fig. 1f and lacking the amino acid residues from
position 1768 to 1788 of the sequence identified
therein.

Seq ID No. 42 is an amino acid sequence of Hs-unc-53
of variant 1 identified in Figure 1f.

Seq ID No. 43 is an amino acid sequence of Hs-unc-53
of variant 1 identified in Figure 1f and lacking the
amino acid residues from position 1326 to 1413 of the
sequence identified therein.

Seq ID No. 44 is an amino acid sequence of Hs-unc-53
of variant 1 identified in Figure 1f and lacking the
amino acid residues from position 1414 to 1427 of the
sequence identified therein.

Seq ID No. 45 is an amino acid sequence of Hs-unc-53
of variant 1 identified in Figure 1f and lacking the
amino acid residues from position 1703 to 1709 of the
sequence identified therein.

Seq ID No. 46 is an amino acid sequence of Hs-unc-53
of variant 1 identified in Figure 1f and lacking the
amino acid residues from position 1768 to 1788 of the
sequence identified therein.
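
The deletion ranges quoted in the listing can be cross-checked arithmetically. The Python sketch below is a minimal illustration and assumes, since the pairing is not stated explicitly in the text, that the nucleotide deletions of Seq ID Nos. 1 to 3 correspond to the amino acid deletions of Seq ID Nos. 4 to 6 respectively. It checks that each nucleotide range spans three times as many positions as the matching amino acid range. The helper names are hypothetical, and `delete_range` shows how a variant could be derived from a full-length sequence using 1-based, inclusive positions.

```python
def span_length(start, end):
    """Number of positions in a 1-based, inclusive range."""
    return end - start + 1


def delete_range(sequence, start, end):
    """Return the sequence without positions start..end (1-based, inclusive)."""
    return sequence[:start - 1] + sequence[end:]


# Ranges copied from the sequence listing (Hs-unc-53/1).
NT_DELETIONS = {1: (2873, 3043), 2: (3098, 3121), 3: (3518, 3526)}
AA_DELETIONS = {4: (958, 1014), 5: (1033, 1040), 6: (1173, 1175)}

# Assumed correspondence between nucleotide and protein entries.
ASSUMED_PAIRS = [(1, 4), (2, 5), (3, 6)]


def check_pairs():
    """Map (nt id, aa id) -> (nt length, aa length, lengths consistent?)."""
    results = {}
    for nt_id, aa_id in ASSUMED_PAIRS:
        nt_len = span_length(*NT_DELETIONS[nt_id])
        aa_len = span_length(*AA_DELETIONS[aa_id])
        results[(nt_id, aa_id)] = (nt_len, aa_len, nt_len == 3 * aa_len)
    return results


if __name__ == "__main__":
    for pair, result in check_pairs().items():
        print(pair, result)
```

Under this assumed pairing the three deletions are 171, 24 and 9 nucleotides long, matching 57, 8 and 3 amino acids.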

CLAIMS

1. A vertebrate protein homologue of a UNC-53
protein of C. elegans, which protein comprises an
amino acid sequence having one or more of sequence
blocks A, B, C, D, E, F, G, or H as illustrated in
figure 4 or which differs from said blocks in
conservative amino acid changes.

2. A vertebrate protein homologue of UNC-53
protein of C. elegans or a functional equivalent,
derivative or bioprecursor therefor having an amino
acid sequence encoded by the nucleotide sequence
illustrated in figure 1(e) or the sequence of figure
1(e) having nucleotide region from position 1 to 288
replaced with the sequence of variant 1 illustrated in
figure 1(e) and/or which sequences further lack any of
the sequences from 3795 to 4283, 4284 to 4325, 5153 to
5173 or 5343 to 5408.

3. A vertebrate protein homologue of UNC-53
protein of C. elegans having an amino acid sequence as
illustrated in figure 1(f) or an amino acid sequence
which differs from said amino acid sequence
illustrated in figure 1(f) by the replacement of amino
acids 1 to 81 with the sequence of variant 1 in figure
1(f) and/or including deletions from position 1326 to
1413, 1414 to 1427, 1703 to 1709 or 1768 to 1788, or
which differs from said sequences in one or more
conservative amino acid changes.

4. A cDNA molecule encoding a vertebrate
homologue of UNC-53 protein of C. elegans according to
any of claims 1 to 3.

5. A cDNA molecule according to claim 4 which
cDNA comprises the sequence of nucleotides illustrated
in figure 1(e).

6. A nucleic acid molecule capable of
hybridising to the cDNA sequences according to claims
4 or 5 under high stringency conditions.

7. A DNA expression vector which comprises a
cDNA molecule as claimed in claim 4 or 5.

8. A vector according to claim 7 which
comprises a promoter of C. elegans UNC-53 protein or a
vertebrate homologue thereof according to any of
claims 1 to 7.

9. A vector according to claim 8 wherein said
promoter sequence is derived from a gene encoding a
mouse or human homologue of a UNC-53 protein of
C. elegans.

10. A vector according to any of claims 7 to 9
which further comprises a sequence encoding a reporter
molecule.

11. A vector according to claim 10 wherein said
reporter molecule is a fluorophore.

12. A host cell transformed or transfected with
the vector of any of claims 7 to 11.

13. A host cell transformed or transfected with
the vector of claims 10 or 11.

14. A host cell according to claim 12 or 13
which cell comprises a prokaryotic cell, such as a
bacterial cell, or a eukaryotic cell such as a fungal,
an animal, a plant or an insect cell.

15. A transgenic cell, tissue or organism
comprising a transgene capable of expressing a protein
according to any of claims 1 to 3.

16. A transgenic cell, tissue or organism
according to claim 15 which comprises any of a COS
cell, Hep G2, MCF-7 cell, N4 mouse neuroblastoma cell,
a NIH3T3 cell, or colorectal carcinoma or human
derived cells.

17. A transgenic cell, tissue or organism
according to claim 15 or 16 wherein said transgene
comprises a vector according to any of claims 7 to 11.

18. A transgenic cell, tissue or organism
according to claim 15 or 17 wherein said transgene
comprises a vector according to claim 10 or 11.

19. A transgenic cell, tissue or organism
according to any of claims 15 to 17 wherein said
organism comprises any of an insect, a fungus, a
non-human mammal, a plant or a nematode worm.

20. A method of producing a mutant vertebrate
non-human organism which mutation affects cell
behaviour or the regulation of cell motility or the
shape or the direction of cell migration, which method
comprises inducing a mutation in the wild type gene
encoding the vertebrate homologue of an UNC-53
C. elegans protein.

21. A vertebrate protein homologue of an UNC-53
protein of C. elegans, according to any of claims 1 to
3 for use as a medicament.



22. Use of a vertebrate protein homologue of an
UNC-53 protein of C. elegans, according to any of
claims 1 to 3 in the manufacture of a medicament for
promoting neuronal regeneration, revascularisation,
wound healing or for treatment of chronic
neurodegenerative diseases or acute traumatic injuries
or fibrotic disease or autoimmune diseases such as
rheumatoid arthritis and sclerosis.

23. A pharmaceutical composition comprising a
vertebrate homologue of an UNC-53 protein of
C. elegans, according to any of claims 1 to 3 together
with a pharmaceutically acceptable carrier, diluent or
excipient therefor.

24. A nucleic acid or cDNA molecule according to
any of claims 4 to 6 or a functional fragment thereof
for use as a medicament.

25. Use of nucleic acid or cDNA molecule
according to any of claims 4 to 6 in the manufacture
of a medicament to promote neuronal regeneration,
revascularisation or wound healing, or for treatment
of chronic neurodegenerative diseases or acute
traumatic injuries or fibrotic disease or autoimmune
diseases such as rheumatoid arthritis and sclerosis.

26. A pharmaceutical composition comprising a
nucleic acid or cDNA molecule according to any of
claims 4 to 6 and a pharmaceutically acceptable
carrier, diluent or excipient therefor.

27. A method of determining whether a compound
is an inhibitor or enhancer of the regulation of cell
behaviour, growth, cell shape or motility or the
direction of cell migration, which method comprises
contacting said compound with a host cell according to
claim 12 or 14 or a transgenic cell as claimed in any
of claims 15 to 18 and screening for a phenotypic
change in said cell.

28. A method according to claim 27 wherein said
phenotypic change to be screened is a change in cell
growth, or shape or a change in cell motility or
filopodia outgrowth, ruffling behaviour, cell
adhesion, contact inhibition or the length of neurite
growth.

29. A method as claimed in claim 27 wherein said
transgenic cell is an N4 neuroblastoma cell and the
phenotypic change to be screened is the length of
neurite growth.

30. A method as claimed in claim 27 wherein said
transgenic cell is an MCF-7 breast carcinoma cell or
an NIH3T3 cell and the phenotypic change to be
screened is the extent of phagokinesis or contact
inhibition.

31. A method of determining whether a compound
is an inhibitor or an enhancer of the regulation of
cell shape, cell growth or motility or of the
direction of cell migration, which method comprises
administering said compound to a transgenic organism
according to any of claims 15 to 19 or a mutant
organism produced according to the method of claim 20
and screening for a phenotypic change in said
organism.

32. A compound which is identifiable by the
method according to claim 27 as an enhancer of the
regulation of cell shape, or growth or motility or the
direction of cell migration for use as a medicament.

33. Use of a compound which is identifiable by
the method according to claim 27 as an enhancer of the
regulation of cell shape, or growth or motility or the
direction of cell migration in the preparation of a
medicament for promoting neuronal regeneration,
revascularisation or wound healing or for treatment of
chronic neurodegenerative diseases or acute traumatic
injuries or fibrotic disease or autoimmune diseases
such as rheumatoid arthritis or sclerosis.

34. A pharmaceutical composition comprising a
compound identified according to the method of any of
claims 27 to 31 and a pharmaceutically acceptable
carrier, diluent or excipient therefor.

35. A compound which is identifiable by the
method according to any one of claims 17 to 31 as an
inhibitor of the regulation of cell motility, growth,
or shape, or the direction of cell migration, for use
as a medicament.

36. Use of a compound according to claim 35 in
the manufacture of a medicament for alleviating the
spread of disease inducing cells or metastasis or loss
of contact inhibition.

37. A pharmaceutical composition comprising the
compound as claimed in claim 35, and a
pharmaceutically acceptable carrier, diluent or
excipient therefor.

38. A method of determining whether a compound
is an inhibitor or an enhancer of transcription of a
gene encoding a vertebrate homologue of UNC-53 protein
of C. elegans, according to any of claims 1 to 3, which
method comprises the steps of (a) contacting said
compound with a cell according to claim 13 or 18 and
(b) monitoring the level of said reporter molecule and
comparing the results obtained from said monitoring
step with a control comprising a cell according to
claims 13 or 18, which cell has not been contacted
with said compound.

39. A method as claimed in claim 38 wherein said
reporter molecule detected is mRNA or green
fluorescent protein.

40. A compound which is identifiable by the
method according to claims 38 or 39, as an enhancer of
transcription of a gene coding for a vertebrate
homologue of an UNC-53 protein of C. elegans according
to any of claims 1 to 3 or a functional fragment of
said gene, for use as a medicament.

41. Use of a compound which is identifiable by
the method of claims 38 or 39, as an enhancer of
transcription of a gene coding for a vertebrate
homologue of an UNC-53 protein of C. elegans according
to any of claims 1 to 3 or a functional fragment of
said gene, in the manufacture of a medicament for
promoting neuronal regeneration, revascularisation or
wound healing, or for treatment of chronic
neurodegenerative diseases or acute traumatic injuries
or fibrotic disease or autoimmune diseases such as
rheumatoid arthritis or sclerosis.

42. A pharmaceutical composition which comprises
the compound of claim 40 and a pharmaceutically
acceptable carrier, diluent or excipient therefor.

43. A compound which is identifiable by the
method of claims 38 or 29 as an inhibitor of
transcription of a gene coding for vertebrate
homologue of a UNC-53 protein of C. elegans according
to any of claims 1 to 3 or a functional fragment of
said gene for use as a medicament.

44. Use of a compound which is identifiable by
the method of claims 38 or 39 as an inhibitor of
transcription of a gene coding for a vertebrate
homologue of an UNC-53 protein of C. elegans or a
functional fragment of said gene, in the manufacture
of a medicament for alleviating spread of disease
inducing cells or metastasis or loss of contact
inhibition.

45. A pharmaceutical composition which comprises
the compound of claim 43 and a pharmaceutically
acceptable carrier, diluent or excipient therefor.

46. A kit for determining whether a compound is
an enhancer or an inhibitor of the regulation of cell
motility, growth or shape or the direction of cell
migration which kit comprises at least one transgenic
cell as claimed in any one of claims 13 to 17 to be
contacted with said compound and at least one cell
according to claims 12 to 19 to be used as a control
and means for contacting said compound with one of
said at least one transgenic cells.

47. A kit for determining whether a compound is
an inhibitor or an enhancer of transcription of a
gene coding for a vertebrate homologue of an UNC-53
protein of C. elegans or a functional fragment of said
gene which kit comprises at least one cell as claimed
in any one of claims 12 to 19 and means for contacting
said compound with said cells.

48. A kit for determining whether a compound is
an enhancer or an inhibitor of the activity of a
vertebrate homologue of an UNC-53 protein of
C. elegans or a functional equivalent, derivative,
fragment or bioprecursor of said vertebrate homologue
protein, which kit comprises at least one vertebrate
mutant non-human organism produced according to the
method as claimed in claim 20 or a transgenic organism
as claimed in claims 15 to 19 and a wild type of said
vertebrate mutant organism.

49. A method of identifying vertebrate
homologues of an unc-53 gene of C. elegans or a
functional fragment thereof, which method comprises
hybridizing to a DNA library a suitable
oligonucleotide sequence of between 15 to 50
nucleotides of the nucleic acid sequence encoding
UNC-53 or a functional equivalent, derivative or
bioprecursor thereof, under appropriate conditions of
stringency to identify genes having statistically
significant homology with the cDNA according to any of
claims 4 or 5.

50. A method of identifying a protein which is
active in the signal transduction pathway of a cell of
which a vertebrate homologue of an UNC-53 protein of
C. elegans according to any of claims 1 to 3 is a
component, which method comprises:

(a) contacting an extract of said cell with an
antibody to the vertebrate homologue of the
UNC-53 protein of C. elegans,

(b) identifying the antibody/vertebrate
homologue complex, and

(c) analysing the complex to identify any
protein bound to the vertebrate homologue of
UNC-53 protein of C. elegans other than the
antibody.

51. A method of identifying a further protein
which is active in the signal transduction pathway of
a cell of which a vertebrate homologue of an UNC-53
protein according to any of claims 1 to 3 is a
component, which method comprises:

(a) forming an antibody to the first
identified protein bound to the vertebrate
homologue of UNC-53 protein of C. elegans in
claim 50,

(b) contacting a cell extract with said
antibody and identifying the
antibody/protein complex,

(c) analysing the complex to identify any
further protein bound to the first protein
other than the antibody, and

(d) optionally repeating steps (a) to (c)
to identify further proteins in said
pathway.

52. A method of identifying a protein which is
active in the signal transduction pathway of a cell of
which a vertebrate homologue of an UNC-53 protein of
C. elegans according to any of claims 1 to 3 is a
component, which method comprises:

(a) contacting an extract of said cell with
said vertebrate homologue of an UNC-53
protein of C. elegans,

(b) identifying any vertebrate homologue of
UNC-53 protein/protein complex formed and

(c) analysing the complex to identify any
protein bound to the vertebrate homologue of
UNC-53 protein other than the same
vertebrate homologue of UNC-53 protein.

53. A method according to claim 52 which further
comprises contacting a cell extract with any protein
identified from step (c) not being the same as the
vertebrate homologue of UNC-53 protein used and
repeating steps (b) and (c) so as to identify any
further protein involved in the signal transduction
pathway of said cell.

54. A method of identifying a protein involved
in the signal transduction pathway of a cell of which
a vertebrate homologue of an UNC-53 protein of
C. elegans is a component which method comprises:

(a) providing an appropriate host cell
having a DNA construct comprising a reporter
gene under the control of a promoter
regulated by a transcription factor having a
DNA binding domain and an activating domain,

(b) expressing in said host cell a first
hybrid DNA sequence encoding a first fusion
of a fragment or all of a DNA sequence
according to claims 4 or 5 and either said
DNA binding domain or the activating domain
of the transcription factor,

(c) expressing in the host cell at least
one second hybrid DNA sequence encoding a
putative binding protein to be investigated
together with the DNA binding or activating
domain of the transcription factor which is
not incorporated in the first fusion,

(d) detecting any binding of the protein
being investigated with a protein according
to any of claims 1 to 3 by detecting for the
production of any reporter gene product in
said host.



55. A protein identified by the method of any 
one of claims 50 to 54 for use as a medicament. 

56. Use of a protein identified by the methods
of any one of claims 50 to 54 in the manufacture of a
medicament for promoting neuronal regeneration,
revascularisation or wound healing, or for treatment
of chronic neurodegenerative diseases or acute
traumatic injuries or fibrotic disease or autoimmune
diseases such as rheumatoid arthritis and sclerosis.

57. A pharmaceutical composition comprising a
protein identified by the methods of any one of claims
50 to 54 and a pharmaceutically acceptable carrier,
diluent, or excipient therefor.

58. A process for producing a vertebrate
homologue of an UNC-53 protein of C. elegans according
to any of claims 1 to 3 which process comprises
culturing the cells of any of claims 12 to 14 and
recovering said vertebrate homologue of UNC-53 protein
expressed.

59. A process for producing a vertebrate
homologue of an UNC-53 protein of C. elegans according
to any of claims 1 to 3 which process comprises
culturing an insect cell transfected with a
recombinant Baculovirus vector, said vector comprising
a DNA insert encoding said vertebrate homologue of
UNC-53 protein downstream of the Baculovirus
polyhedrin promoter, and recovering the expressed
vertebrate homologue of UNC-53 protein.

60. A method of detecting whether a compound is
an inhibitor or an enhancer of expression of a
vertebrate homologue of an UNC-53 of C. elegans
according to any of claims 1 to 3 which method
comprises contacting a cell expressing said homologue
with said compound and monitoring for a phenotypic
change compared to a control cell which has not been
contacted with said compound.

61. A method according to claim 60 wherein said
cell comprises a cell according to any of claims 12 to
19.

62. A method according to claim 60 wherein said
cell has undergone loss of contact inhibition.

63. A method according to any of claims 60 to 62
in which the compound to be tested comprises a nucleic
acid.

64. A method according to claim 63 wherein said
nucleic acid sequence comprises an antisense DNA or
RNA sequence.

65. A method according to claim 64 wherein said
mRNA sequence comprises 3' untranslated regions of
mRNA encoding for said vertebrate homologue.

66. A method according to any of claims 60 to 62
wherein said compound to be tested comprises a protein
having an amino acid sequence potentially suitable for
inhibiting function of said vertebrate homologue.

67. A method according to claim 66 wherein said
protein comprises a protein identified according to
any of the methods of claims 50 to 54.

68. A pharmaceutical composition comprising a
compound identified according to any of claims 60 to
67 together with a pharmaceutically acceptable
carrier, diluent or excipient therefor.

69. A nucleic acid sequence identified according 
to the method of any of claims 63 to 65 for use as a 
medicament. 

70. Use of a nucleotide sequence identified
according to the method of any one of claims 63 to 65
in the preparation of a medicament for the treatment
of loss of contact inhibition or cancer which is
mediated by a vertebrate homologue of an UNC-53
protein of C. elegans.

71. Use of a nucleic acid according to claim 69
in the preparation of a medicament for inhibiting
expression of a gene coding for a vertebrate homologue
of an UNC-53 protein of C. elegans.

72. An assay for detecting expression of a
vertebrate homologue of UNC-53 protein of C. elegans
according to any of claims 1 to 3 in a vertebrate cell
which assay comprises contacting a cell or an extract
thereof with an antibody to said vertebrate homologue,
which antibody is linked to a reporter molecule,
removing any unbound antibody and monitoring for the
presence of said reporter molecule.

73. An assay according to claim 72 wherein said
reporter molecule is an antibody conjugated with a
suitable fluorophore or detectable enzyme.

74. A method for detecting for expression of a
gene coding for a vertebrate homologue of an UNC-53
protein of C. elegans according to any of claims 1 to
3 which method comprises contacting a probe specific
for a nucleic acid or protein sequence coding for or
corresponding to said vertebrate homologue according
to any of claims 1 to 3 with a cell extract which
probe is linked to a reporter and analysing for the
presence of said reporter.

75. A method according to claim 74 wherein said
probe comprises a complementary sequence to a region
of mRNA transcribed from said gene encoding said
vertebrate homologue of UNC-53 protein.



76. A method according to claim 75 wherein said
complementary sequence is a 3' or 5' untranslated
region of said mRNA.

77. A method according to claims 74 or 76
wherein said reporter comprises a radiolabel.

78. A method according to claim 74 wherein said
probe comprises an antibody specific for said
vertebrate homologue of said UNC-53 protein according
to any of claims 1 to 3.

79. A method according to claim 78 wherein said
reporter comprises an antibody conjugated with a
detectable fluorophore or enzyme.

80. A method of determining whether a compound
is an inhibitor or an enhancer of association of a
vertebrate homologue according to any of claims 1 to 3
to microtubules or plus end regions thereof, which
method comprises:-

(a) contacting said compound with a
transgenic cell, tissue or organism
expressing UNC-53 protein or said vertebrate
homologue and which protein is operably
linked to a reporter molecule,

(b) screening for the localisation of said
reporter molecule as compared to a cell
according to step (a) which has not been
contacted with said compound.



81. A compound identifiable by the method 
according to claim 80. 

82. A compound according to claim 81 for use as 
a medicament. 

83. Use of a compound according to claim 81 as
an enhancer of association of said vertebrate
homologue with microtubules or the plus end region
thereof, for use in promoting neuronal regeneration,
revascularisation or wound healing, or for treating
chronic neurodegenerative diseases or acute traumatic
injuries or fibrotic disease or autoimmune diseases
such as rheumatoid arthritis or sclerosis.



84. A pharmaceutical composition comprising the
compound according to claims 81 or 82 and a
pharmaceutically acceptable carrier, diluent or
excipient therefor.



85. A kit for determining whether a compound is
an inhibitor or an enhancer of association of a
vertebrate homologue according to any of claims 1 to 3
with microtubules or the plus end regions thereof,
which kit comprises at least one transgenic cell
expressing said homologue and a reporter molecule or a
cell according to any of claims 12 to 19 and at least
one cell of the same cell type for use as a control
and means for contacting said compound with one of
said at least one transgenic cells.

86. A composition comprising a vertebrate
homologue according to any of claims 1 to 3 linked to
a compound identified as an inhibitor or enhancer of
association of said vertebrate homologue with
microtubules or their plus end regions for use in
targeting said compound to said microtubule or the
plus end region thereof.

87. A composition according to claim 86 which
further comprises a cell transformation or
transfecting agent.



88. A method of targeting a protein to a cell
microtubule or the plus end region thereof, which
method comprises introducing into a host cell, tissue
or organism a transgene comprising a sequence capable
of expressing a vertebrate homologue according to any
of claims 1 to 3, which sequence is operably linked to
a sequence encoding said protein to be targeted such
that a chimeric protein is expressed and which results
in targeting said protein to said microtubule or a
plus end region thereof.

89. A method of identifying a molecule which
covalently modifies a vertebrate homologue of UNC-53
according to any of claims 1 to 3 which method
comprises:

a) contacting an extract from a cell expressing
said vertebrate homologue with a mixture of
enzymes comprising candidate modifying enzymes in
the presence of an indicator of covalent
modification of a protein,

b) identifying any covalently modified UNC-53
protein from step a),

c) identifying said molecule involved in said
modification step.

90. A method according to claim 89, wherein said
indicator comprises 32P.

91. A method of identifying a compound which
alleviates or enhances the toxicity of a vertebrate
homologue according to any of claims 1 to 3, which
method comprises contacting said compound with a cell,
tissue or organism according to claim 18, and
monitoring for the presence of said reporter molecule
adjacent said microtubules or the plus end regions
thereof.

92. A vertebrate homologue of UNC-53 protein of
C. elegans or a functional equivalent, derivative or
bioprecursor therefor encoded by the nucleotide
sequence in Figure 1a and which nucleotide sequence is
lacking in any of the nucleotide regions from position
2873 to 3043, 3098 to 3121 or 3518 to 3526.

93. A vertebrate homologue of UNC-53 protein of
C. elegans or a functional equivalent, derivative or
bioprecursor therefor having an amino acid sequence as
illustrated in Figure 1b and lacking in one or more of
the regions from residues 958 to 1014, 1033 to 1040 or
1173 to 1175, or which differs from said amino acid
sequences in one or more conservative amino acid
changes.

94. A vertebrate homologue of UNC-53 protein of
C. elegans or a functional equivalent, derivative or
bioprecursor therefor encoded by the nucleotide
sequence in Figure 1c and which nucleotide sequence
has from sequence position 1 to 366 replaced with any
of the sequences identified as variants 1 to 3 of
Figure 1c and/or which sequences lack the region from
position 5624 to 6024.

95. A vertebrate homologue of UNC-53 protein of
C. elegans or a functional equivalent, derivative or
bioprecursor therefor having an amino acid sequence
identified in Figure 1d or the sequences of any of
variants 1 to 3 replacing the amino acids from
position 1 to 89 of the sequence of Figure 1d and/or
which sequence is lacking the amino acid sequence from
position 1776 to 1778.

96. Plasmid pG313303 deposited under accession
number LMBP 3936.

97. Plasmid pG13305 deposited under accession
number LMBP 3937.

Figure 1a. Nucleotide sequence of Hs-unc-53/1



CATGCTGCCCAAGCGCGCCAAGGCGCCCGGCGGCGGCGGCGGCATGGCCAAGGCCAGCGCGGCTGAGCTGAAGGT 75 
CTTCAAGTCCGGCAGCGTGGACAGC CGTGTCCCCGGCGGGCCGCCCGCC TCCAACCTGCGCAAGCAGAAGTCACT 150 
CACCAACCTCTCTTTTCTCACGGACTCCGAGAAAAAGCTGCAGCTTTATGAGCCCGAATGGAGCGACGATATGGC 225 
CAAGGCGCCCAAAGGCTTAGGCAAGGTGGGGTCC AAGGGCCGTGAAGCTCCGCTGATGTCCAAGACGCTGTCCAA 300 
GTCGGAGCACTCGCTCTTCCAGGCCAAGGGCAGCCCGGCGGGCGGTGCCAAGACCCCCCTGGCTCCGCTCGCGCC 375 
CAACCTGGGAAAGCCGAGCCGGATCCCTCGAGGACCCTATGCGGAGGTCAAGCCGCTCAGCAAGGCGCCTGAAGC 450 
GGCCGTGAGCGAAGATGGCAAATCGGACGACGAGCTGCTCTCCAGCAAGGCCAAGGCGCAAAAGAGCTCTGGGCC 525 
TGTCCCCTCTGCCAAGGGCCAGGAGGAGCGCGCCTTCCTCAAGGTGGACCCCGAGCTGGTGGTGACCGTGCTGGG 600 
AGACCTGGAGCAGCTGCTCTTCAGCCAGATGCTGGACCCAGAGTCCCAGAGAAAGAGGACAGTGGAGAATGTCCT 675 
GGATCTCCGGCAGAACCTGGAAGAGACCATGTCC AGCCTGCGAGGGTCCCAGGTGACTCACAGCTCCCTGGAGAT 750 
GACCTGCTACGACAGCGATGATGCCAACCCACGCAGCGTGTCCAGCCTCTCCAACCGCTCGTCCCCTCTGTCATG 82 5 
GCGCTATGGCCAGTCCAGTCCGCGGCTGCAGGCTGGTGACGCGCCCTCTGTGGGTGGGAGCTGCCGCTCGGAGGG 900 
GACGCCCGCCTGGTACATGCACGGCGAACGGGCCCACTACTCCCACACCATGCCCATGCGCAGCCGCAGCAAGCT 975 
CAGCCATATCTCCCGCCTGGAGCTGGTCGAATCCCTGGACTCGG ATGAGGTGGACCTCAAGTCCGGCTACATGAG 1050 
CGAC AGTGACCTCATGGGCAAGACC ATGACGGAGGATGATGAC ATC ACTACCGGCTGGGATGAAAGCAGCTCCAT 1125 
CAGTAGTGGACTCAGCGATGCCTCAGACAATCTCAGTTCAGAAGAATTCAATGCCAGCTCCTCACTCAACTCCCT 1200 
CCCAAGTACTCCCACTGCTTCTCGCAGGAACTCAACAATAGTGCTACGC AC AGACTCAGAGAAGCGCTCACTGGC 1275 
AGAAAGTGGGCTGAGCTGGTTTAGTGAATCAGAGGAGAAAGCCCCTAAAAAACTGGAGTACGACAGTGGTAGCCT 13 50 
GAAGATGGAACCTGGGACTTCTAAGTGGCGGAGGGAGCGGCCTGAGAGCTGTGATGATTCATCCAAGGGTGGAGA 1425 
ACTGAAAAAGCCCATCAGCCTGGGCC ACCCTGGTTCCCTGAAGAAGGGC AAGACCCCACCTGTGGCTGTAACTTC 1500 
CCCCATCACTCACACAGCCCAGAGTGCCCTCAAAGTCGCAGGC AAACCTGAGGGCAAAGCTACAGACAAGGGTAA 1575 
GCTTGCAGTGAAGAATACTGGGCTCCAACGCTCCTCCTCTGATGCTGGTCGGGACCGCCTGAGTGATGCTAAGAA 1650 
GCCCCCCTCGGGCATTGCTCGCCCCTCCACTTCGGGATCCTTTGGCTACAAGAAGCCTCCTCCTGCCACAGGCAC 1725 
AGCCACTGTCATGCAAACTGGTGGTTCAGCCACTCTCAGCAAGATCCAGAAGTCCTCAGGCATCCCTGTCAAGCC 1800 
AGTAAATGGGCGCAAGACTAGCTTAGATGTTTCCAACAGTGCAGAGCCAGGATTCCTGGGTCCTGGAGCCCGTTC 1875 
TAACATCCAGTACCGCAGCCTGCCCCGGCC AGCCAAGTC AAGTTCT ATGAGCGTGACCGGCGGGCGGGGTGGACC 1950 
TCGCCCTGTGAGCAGCAGCATTGACCCCAGTCTCCTCAGC ACC AAGC AGGGAGGCCTTACGCCTTCCAGACTGAA 2025 
GGAGCCTACCAAGGTAGCCAGTGGGCGGACCACTCCAGCCCCTGTCAATC AGACAGATCGGGAAAAGGAGAAGGC 2100 
CAAAGCCAAGGCAGTGGCCTTGGACTCAGACAACATCTCCTTGAAGAGTATTGGCTCCCCAGAAAGTACTCCCAA 2175 
GAACCAAGCAAGCCACCCCAC AGCCACCAAGCTGGCAGAGCTGC C AC C AAC CCCTCTCAGGGCCACAGCGAAGAG 225 0 
CTTTGTCAAACCACCCTCACTAGCCAATCTTGACAAGGTC AACTCCAACAGTCTGGATCTACCATCATCCAGTGA 2325 
TACCACCC ATGCTTCAAAGGTCCCAGATCTGCATGCTACAAGCTC AGC ATCTGGGGGCCCTCTCCCTTCCTGCTT 2400 
CACCCCCAGTCCGGCACCCATCCTCAATATTAACTCAGCC AGCTTCTCCC AGGGCCTGGAGCTAATGAGTGGTTT 2475 
CAGTGTGCCAAAAGAGACCCGC ATGTACCCCAAACTCTC AGGCCTGC AC AGGAGCATGGAGTCCCTCCAGATGCC 2550 
AATGAGCCTCCCCAGTGCCTTCCCCAGCAGTACTCCCGTCCCCACCCCACCTGCTCCCCCTGCTGCTCCCACAGA 2625 
AGAAGAGACGGAAGAGCTGACTTGGAGTGGAAGCCCCAGAGCTGGGCAACTGGACAGTAATCAGCGGGATCGGAA 2700 
CACTCTTCCCAAGAAAGGGCTCAGGTACCAGCTTCAGTCCC AGG AGGAGACCAAGGAGAGGCGACATTCCCATAC 2775 
CATTGGTGGGCTGCCTGAATCCGATG ACCAGTCAGAGCTGCCTTCTCCCCCTGCACTTCCCATGTCTCTGAGTGC 2850 
AAAGGGCCAACTTACCAACATAgtgagtcccactgcggccaccacgccaagaatcacccgctccaacagcatccc 2925 
cacccacgaggcggccttcgagctgtacagcggctcccaaatggggagcaccctgtccctggccgagagacccaa 3000 
gggaatgattcggtcaggacccttccgagaccccacggacgatGTTCACGGCTCAGTGCTGTCCCTGGCCTCCAG 307 5 
TGCCTCCTCCACCTACTCCTCAgc cgaggagagga tgcaa t c tgagCAAATCCGGAAGCTTCGTAGGGAACTGGA 3150 
ATCATCCCAGGAAAAAGTGGCCACCTTGACGTCTCAGCTTTCTGCCAATGCTAATCTGGTGGCTGCTTTTGAGCA 322 5 
GAGCCTGGTGAATATGACATCCCGCCTGCGACACCTGGCAGAGACGGCCGAGGAGAAGGACACTGAGCTGCTGGA 33 00 
TTTGCGAGAAACCATAGACTTTCTGAAGAAAAAGAACTCTGAGGCCCAGGCAGTCATTCAGGGAGCCCTTAATGC 3375 
CTCAGAAACCACACCCAAAGAACTTCGGATCAAGAGACAAAACTCCTCAGATAGCATCTCAAGCCTCAACAGCAT 3450 
C ACTAGCCATTCC AGC ATCGGC AGC AGC AAGGATGCTGATGCGAAAAAGAAG AAAAAAAAGAGTTGGg t C ta tga 352 5 
gCTTCGAAGTTCCTTCAACAAAGCGTTCAGTATAAAAAAGGGGCCC AAGTCAGCTTCCTCATACTCGGATATAGA 3 600 
GGAGATTGCTACACCCGACTCTTCAGCCCCCTCATCCCCCAAACTACAGCATGGTTCTACAGAGACTGCTTCACC 3 675 
CTCCATCAAGTCCTCCACCTYGTCCTCCGTGGGC ACTGATGTCACCGAGGGCCCTGCTCACCCAGCCCCCCACAC 375 0 
TAGGCTGTTCCATGCAAATGAGGAGGAGGAGCCAGAGAAGAAGGAGGTATCGGAGCTGCGCTCTGAGCTATGGGA 3 82 5 
GAAGGAAATGAAGCTTACAGACATCCGCTTGGAGGCCCTCAACTCTGCCCACCAACTGGATCAGCTTCGGGAGAC 3 90 0 
CATGCACAACATGCAGTTGGAGGTGGACCTGCTGAAAGCAGAGAATGACCGACTGAAGGTAGCCCCAGGCCCCTC 3 97 5 
ATCAGGCTCCACTCCAGGGCAGGTCCCTGGATCATCTGCATTATCTTCCCCACGCCGCTCCCTAGGCCTGGCACT 4050 
CACCCATTCCTTCGGCCCCAGTCTTGCAGACACAGACCTGTCACCCATGGATGGC ATC AGTACTTGTGGTCCAAA 4125 
GGAGGAAGTGACCCTCCGGGTGGTGGTGAGGATGCCCCCGC AGCAC ATC ATC AAAGGGGACTTGAAGCAGCAGGA 4200 
ATTCTTCCTGGGCTGT AGC AAGGTC AGTGG AAAAGTTGACTGGAAG ATGCTGG ATGAAGCTGTTTTCCAAGTGTT 427 5 
CAAGGACTATATTTCTAAAATGGACCCAGCCTCTACCCTGGGACTAAGCACTGAGTCCATCCATGGCTACAGCAT 4350 
CAGCCACGTGAAACGAGTGTTGGATGC AGAGCCCCCCGAGATGCCTCCTTGCCGTCGAGGTGTCAATAACATATC 442 5 
AGTCTCCCTCAAAGGTCTGAAGGAGAAATGCGTCGAC AGCCTGGTGTTCGAGACGCTGATCCCCAAGCCGATGAT 4500 
GCAGCACTACATAAGCCTCCTGCTG AAGC ACCGGCGCCTCGTCCTCTCGGGCCCC AGCGGCACGGGCAAGACCTA 4 575 
CCTGACCAATCGCTTGGCCGAGTACCTGGTGGAGCGCTCTGGCCGTGAGGTC ACAGAGGGCATCGTCAGCACCTT 4 650 
CAACATGCACC AGC AGTCTTGC AAGGATCTGCAACTGTATCTTTCC AACCTAGCC AACC AGATAGACCGGGAAAC 472 5 
AGGAATTGGGG ATGTGCCCCTGGTG ATTCTATTGGATGACCTG AGTGAAGC AGGCTCC ATC AGTGAGTTGGTC AA 4 800 
TGGGGCCCTCACCTGCAAGTATCATAAATGTCCCTATATTATAGGTACCACCAATCAGCCTGTAAAAATGACACC 4 87 5 
CAACCATGGCTTGCACTTGAGCTTCAGGATGTTGACCTTCTCCAACAACGTGGAGCCAGCCAATGGCTTCCTGGT 4950
TCGTTACCTGAGGAGGAAGCTGGTAGAGTCAGACAGCGACATCAATGCCAACAAGGAAGAGCTGCTTCGGGTC 5025 
CGACTGGGTACCCAAGCTGTGGTATCATCTCCACACCTTCCTTGAGAAGCACAGGAGCTCAGACTTCCTCATCGG 5100 
CGCTTGCTTCTTTCTGTCGTGTCCCATTGGCATTGAGGACTTGCGGACCTGGTTGATTGACCTGTGGAACAACTC 5 17 5 
TATCATTCCCTATCTACAGGAAGGAGCCAAGGATGGGATAAAGGTCCATGGACAGAAAGCTGCTTGGGAGGACCC 5250 
AGTGGAATGGGTCCGGGACACACTTCCCTGGCCATCAGCCCAACAAGACCAATCAAAGCTGTAGCACCTGCCCCC 5325 
ACCCACGGTGGGCCCTCACAGGATTGCCTCACCTCCCGAGGATAGGACAGTCAAAGACAGCACCCCAAGTTCTCT 5400 
GGACTCAGATCCTCTGATGGCCATGCTGCTGAAACTTCAAGAAGCTGCCAAGTACATTGAGTCTCCAGATCGAGA 5475
AACCATCCTGGACCCCAACCTTCAGGCAACACTTTAAGGGTTCGGGAATC ACTGTCACGCCCGGACAGCAGAACG 5550 
CTGGCATCAGGTATCTTAGCTCCTCCTCTCCCCTCTCCTCTTTCAGAGCACTGGCTCTCCAGCCCCAGGAGGAGA 5625 
ACAGGAGGGAGGAGGAGATGAAAGAGGAGGGACAGGTTCTTGGTGCTGTACCTTTGAGAACTTCCTAGGAAGGAA 5700 
TGGTGGGGTGGCGTTTGGGAACTTGTGCCCCCTAAACAC ATTTACTGGCCTCCTCTAATGACTTTGGGGAAAAGA 5775 
TGATTCTGGGTCTTTCCCTTGACTTCTTGTTTCAATTACAAACTGCTGGGCTTTCTGGGGAGGGGTTCAGAAAAC 5850 
ATCAAAACACTGCAGCAGTTCCTAAATGATTCTCAC AAGCAACCCTGAGAGAGACAGTCTTGTGAGGGAGATCTG 5925 
GGGGAGGCAGGAAGCTCCTCAGATTTTCTCACAGACGCTTCCCAATTCCATCACCACTGCCAACACTCGTCCGGA 60 0 0 
ATTC 6004 

In frontal cortex, variants have been found lacking the region from position
2873 to 3043 or the region from position 3098 to 3121. The region from 3518 to
3526 is absent in cDNA from HeLa or colorectal adenocarcinoma tissue. All three
regions are indicated in lower case letters in the figure above. Y at position
3696 stands for C or T. Both nucleotides have been found to be present in cDNAs
of different origin.
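
The nucleotide regions noted above can be related arithmetically to the residue ranges recited in claims 92 and 93. The following is a minimal Python sketch, assuming 1-based inclusive positions and assuming that the open reading frame begins at the ATG occupying positions 2 to 4 of the sequence above (which starts CATG...). The names `ORF_START`, `nt_to_residue` and `nt_range_to_residues` are illustrative only and are not part of the disclosure.

```python
ORF_START = 2  # assumed 1-based position of the A of the initiating ATG


def nt_to_residue(pos, orf_start=ORF_START):
    """Map a 1-based nucleotide position to (residue number, codon position 1..3)."""
    offset = pos - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the assumed ORF start")
    return offset // 3 + 1, offset % 3 + 1


def nt_range_to_residues(first, last, orf_start=ORF_START):
    """Map an inclusive nucleotide range to the inclusive residue range it covers."""
    return nt_to_residue(first, orf_start)[0], nt_to_residue(last, orf_start)[0]


if __name__ == "__main__":
    # Regions reported as absent in some cDNAs (positions taken from the text above).
    for first, last in [(2873, 3043), (3098, 3121), (3518, 3526)]:
        length = last - first + 1
        print(first, last, length, length % 3 == 0, nt_range_to_residues(first, last))
    # Ambiguous base Y reported at position 3696.
    print(nt_to_residue(3696))
```

Under these assumptions the three regions are 171, 24 and 9 nucleotides long, each a whole number of codons, and map to residues 958 to 1014, 1033 to 1040 and 1173 to 1175. Position 3696 falls on the second base of codon 1232, which reads TYG in the sequence above; TTG encodes leucine and TCG encodes serine in the standard genetic code.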

Figure 1b. Amino acid sequence of the protein encoded by Hs-unc-53/1 gene.
Stretches encoded by the DNA sequences lacking in variants from frontal cortex
are in lower case letters (residues 958 to 1014; 1033 to 1040 and 1173 to
1175). The x at position 1232 stands for Leucine or Serine, depending on the
cDNA of origin.

MLPKRAKAPGGGGGMAKASAAELKVFKSGSVDS 75 
KAPKGLGKVGSKGREAPLMSKTLSKSEHSLFQAKGSPAGGA 150 
AVSETCKSDDELLSSKAKAQKSSGPVPSAKGQEERAFLKVI)PELV 22 5 

DLRQNLEETMSSLRGSQVTHSSLEMTCYDSDDANPRSVSSLSNRSSPLSWRYGQSSPRLQAGDAPSVGGSCRSEG 300 
TPAWYMHGERAHYSHTMPMRS PSKLSHI SRXiELVESIiDSDEVDLKSGYMSDSDLMGKTMTEDDDITTGWDES SSI 375 
SSGLSDASDITLSSEEFNASSSI^SLPSTPTASRRNSTIVLRTDSEKRSLAESGLSWFSESEEKAPKKLEYDSGSL 450 
KMEPGTSKWRRERPESCDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSALKVAGKPEGKATDKGK 525 
IAVKNTGLQRSSSDAGRDRLSDAKKPPSGI ARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGI PVKP 600 
VNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMSVTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLK 675 
EPTKVASGRTTPAPWQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASHPTATKIAELPPTPLRATAKS 750 
FTOPPSLAmiDKVNSNSLDLPSSSDTTHASKTODL^ 825 
SVTKETRMYPKLSGLHRSMESLQMPMSLPSAFPSSTPVPTPPAPPAAPTEEETEELTWSGSPRAGQLDSNQRDRN 900 
TLPKKGLRYQLQSQEETKERRHSHTIGGLPESDDQSELPSPPALPMSLSAKGQLTNIvsptaattpritrsnsip 97 5 
theaafelysgsojngstlslaerpkg^rsgsfrdptddVHGSVLSLASSASSTYSSaeermqseQIRKLRRELE 105 0 
SSQEKVATLTSQLSANANLVAAFEQSLVNMTSRLRHLAET 112 5 

SETTPKELRIKRQNSSDSISSI^SITSHSSIGSSKDADAKKKKKKSWvyeLRSSFNKAFSIKKGPKSASSYSDIE 1200 
EIATPDSSAPSSPKLQHGSTETASPSIKSSTXSSVGTDVTEGPAHPAPHTRLFHANEEEEPEKKEVSELRSELWE 1275 
KEMKLTDIRLEALNSAHQLDQLRETMHNMQLETO 1350 
THSFGPSLAiyrDLSPMTCISTCGPKEEVTLRVWRMPPQHIIKGDLKQQEFFLGCSKVSGK^ 142 5 

KDYISKMDPASTLGLSTESIHGYSISHV7CRVLDAEPP 1500 
QHYISLLLIOIRRLVLSGPSGTGKTYLTN^^ 1575 
GIGDVPLVILLDDLSEAGSISELVNGALTCKYHKCPYI IGTTNQPVKMTPNHGLHLSFRMLTFSNNVEPANGFLV 1650 
RYLRRKLVESDSDINANKEELLRVLDWVPKLWYHL^^ 1725 
1 1 PYLQEGAKIX5I KVHGQKAAWEDPVEWVRDTLPWPSAQQDQSKLYHLPPPTVGPHS IAS PPEDRTVKDSTPSS L 1800 
DSDPLMAMLLKLQEAANYIESPDRETILDPNLQATL 183 5 






WO 99/63080 



PCT/EP99/03848 



Figure 1c. Nucleotide sequence of the Hs-unc-53/2 gene

XARGGCCOGGCGCCTGCTCT GC TACCCGCGCTGCCTTTAGCGGTCGCCCCC^ 

GGAAAGCCCAAGCCCCGGGAGAAGATGCCGGCCATCC^^ 

GTGCACAGCGCCGCGCCCATCCTGC^CCTGCC^^ 

AGCAAGGTGGAGGTGAGCAAGACCACCTATCCTAGCCAGATCCCCCTGAA^ 

GAGCCAGCGC»GGAGGGGCTCCCGCTGCGGAAGAGCGGCT^ 

GACTGGGCC3UlTCATTACCTAACCykAATCCGGCCACA 

(RXCTCCSCCTGGCCCAGATTATCCAGGTTGTGG 

Jtf»TCCCAAATGATTGAAAACATAGATGCC^^ 

TCTGCJVGAAGAGATCAGGAATGGAAACCTCAAGGCCA 

C3^GOUXaUX»GCCCCJU3AAGCAGC^ 



75 
150 
225 
300 
375 
450 
525 
600 
675 
750 
825 
900 
975 



GCAGGCAGCGAGGCCAAAACACGCGGAGGGTCAACTACTGCTAACAA^ 

GATAAATCCAAACCAGTCACCTCCCCACCCCCACCGCCAAGCAGC^ 1050 

TCCTCCCACCCCGGAATCAGTGACAAT^ 1125 

AGTACCTCCTCGGCCATCXCGCAGCCCGGTGCAGC 1200 

JU3TGCCACGGTATC»TGCTCTCGGTCAATC 1275 

CCGGCCCCCAACAATCAGAAGTCCATGCTGGAAAAGCTGAAACTTTTC 1350 

CAGGGGCOGGGGTCCCGGGACACAAGCTGTGAGCGGCTGG 1425 

- G^UiGCCGCwuGTCGCAT^ 15 00- 

CaVGAGCLACTTTTAGCCGGGCACTGA 1575 

CAGCGGGAGAAGGATAAGGAGAAAAGCAAGGACCTTGCCAAGAGA<^ 1650 

GAGGAGCCAAAAGAAGACCCCAGTGGAGCAGCTGTGCCCGAGAT^ 1725 

ATCCCOUUtfXKXaMAAGCTC^ 1800 

GGAATGAAGAGCATGCCCCXXjAAATCCCGAAGTG^ 1875 

AAGCTGAGCTCAGGACTCCCCCAGCAGAAGCCCCAGCTGGACGGCAGACACTCC^ 1950 

TCCTCACaUUSGAAAAGGCCCAGGAGGGACCACCCTGAACCA^ 2025 

GGGAGCAGCCAGACCACAGGAAGCAATACCGTCAGTGTTCAGCTACC^ 2100 

JUVCACTGCCACGGTTGCAC ClVlXi CT GT A 2175 

TCAACAGGTGTGAGCGTGGAGCCCAGCCACTTCACCAAGACTGG^ 2250 

CATCCTGA G GCTCGGCGGCTGCGGACAGTGAAGAACATCGCTGAT^ 2325 

AGTTTAAGGGGAACTCAGGTTACACACAGCACAT^ 2400 

GGCCGTAGCATACTCAGCTTGACAGGGAGGCCCACACCTCTGTCCTG^ 2475 

CAAGCAGGAGACGCCCCCTCAATGGGCAATGGOT 2550 

TCAGGTC GCTATG T CT ACTCCGCCCCTCTGAGAAGGG 2625 

GTCTCAGACAAGGeAGGAGATGAGATGGACCTGGAAG^ 27CC 

GAT G TTCT G AGCAAGAACATCCGGACCGATGACATTACAAGCGGA 2775 

CTCGGAGACGCTGACAGCTGGGACGACAGCAGCTCCGTCAGCAGCGGCATC^ 2925 

ACTGATGACATCAACACCAGCTCCTCC^TCAGCTCCT 3000 

GTGCAGACTGATGCTGAGAAGCACTCACAGGTGGAGAGGAATTC 3075 

GACGGAGGCTCAGACAGCGGCATAAAAATGOAGCCAGGTTCCA^ 315C 

GAKTCCGACAAAAGCACGTCGGGCAAGAAGAATCCTGTCATCTCCC^ 3225 

GCTCAGGTGGGCATCACCATGCCAAGGACGAAGGCTTCAGCCC^ 3 3 CO 

AAAACAGACGACGCAAAGCT^TCTGAGAAAGGAAGGCTTTCTC 3375 

GATGGAGGCCGGAGCAGTGGTGACGAATCCAAAAAGCCCCTC^^ 3450 

AACAGCTTTGGGTTCAAGAAGCAGAGTGGTTCCGCCG^ 3525 

ACCAGK^GGTCAGCCACACTGGGCAAAATC^ 3 600 
AGTATGGATGGGGCTC AGAATCAGGATGACGGGTATCTAGCC CTTAAGCTC CCGGACAAACCTTCAGTAC CGGAGT 3 673 

TTGCCGAGGCCCAGTAAGTCCAACAGCCGGAACGGGGCTGGGAACAGG7CTAGCACCAGCAGCATA 375: 

ATTAGCAGCAAGTCCGCAGGCCTGCCAGTGCCCAAACTGAGGGAGCCTTC 3 825 

GTGAAAGTGAATCCGGCAGCCCAGCCTGTGTCCAGTCCGGCTCAGACCAGTCTCCAGCCTGGA 3973 

GATGTGGCCTCTCCCACACTCCGCAGACTCTTTGGTGGG^ 405C 

AACATGAAAAATTCGGTGGTCATCTCCAATCCTCATGCCACCATGACTCAGCAAGGTAACCTAGAC 412 5 

GGCAGTGGCGTCCTGAGCAGTGGGAGCAGCAGTCCT^ 42 Z Z 

GCCTCCAGCCCCAGCTCAGCCCACTCGGCCCCTC 4 2 7 3 
GCAGTTAGCAAGGATGGCCTGGGCTITCAGTCTGTCAGCAGCCTCCACACCAGCTGTGAGTCCATCGACATCTCC 435 : 
CTCAGCAGTGGAGGGGTCCCCAGCCACAATTCTTCCACTGGCCTCATCGCCTCCTCCAAGGACGACTCCTTGACT 4423 
CCCTTTGTCAGAACTAACAGTGTGAAGACCAGACTGTCAGAAAGCCCTCTCTCTTCCCCTGCTGCTAGCCCTAAG 4 5 C Z 
TTCTGCAGAAGTACTCTGCCCAGGAAACAGGACAGTGACCCGCACCTTGATAGGAACACTTTGCCTAAGAAAGGA 4575 

CTCAGGTATACTCCCACCTCCCAGCTTCGCA^ 4 = 5 : 

GGCCrrCAGGACACCGCTGCCAATTCCCCCTTTTCCTCT 4725 
AACTTTTCCCAGCTTGCGAGTCCCACCACTGTCACCCAGATGAGCTTG7CCAACCCGACCATGCTGAGGACTCAC 4830 
AGCCTCTCCAATGCTGATGGGCAGTATGATCCATACACTGACAGCCGCTTCCGGAATAGCTCCATGTCCC 
gj^aj^jVGj^ 4350

TTGcnrrxccA«yu»Tccn^ 5025 

GUbCZQSXSQCZK!^^ 5100 

TTTOAACJUSAGTCTTGGTAA^ 5175 

CTQAMGACSTTXAfiAiUU^CCATTGACCTGCT 5250 

XTTAAObCACCTGAGCXCAACTG^^ 5325 

TCCTOUSACAGCCrrcTCCaCCATCAA^^ 5400 

AAGAAGAAGAAGACMAAGAA^ 5475 

OCAAAA ll^rty i GTC CrCT C ATTCJtG^ 5550 

Cj^^AATCOTICCACACGTTCCACCCCACTGCTGAG 5625 

jk£yr«AAGCTG3U5ACCGTCATGCAGC^ 5700 

GAAGCOT 5775 

CTGAAAGCTGAGAATGATCCXKrXCA^^ 5850 

AlVlVI ^ CTC C CCGAGCCAra 5925 

AT G T TGC TGGATCACACrCXrroAAT^^ 6000 

CAGGMGAJUttCUUU^^ "15 

ACGAAGTGCR^TGTGCTCGATCGGGTCXjTTAGA 6150 

CAGCTAGGGCTCAATTCAGACAGCGTTCTT^ S22 * 
--TGGCTATCTC-trTTG^^^^ 



AACACCCTGGaCTa U^XWiUi rC' IU A G a^ 6375 

Gj^gCACCGTCCGATCATTCTCTCTGGCCCCAgCGGCACTGGG *450 

ATA UUl^TlXAJ AGftGG^ $525 

GAXTTGCGCCXSrrACCTG^ 6600 

ATCATCCTGGACAACCTACACCACCTGAGCTCTCTGGGCGAG^ " 75 

AAATGCCCTTACATAATTGGCACAATGAACCAGGCTACCTCTTCG^ 6750 

AC^ra ^XX^ TTT GTCC CAAC^^ 6825 

GAAACAGAGATCACT«3GCGGGT^^ 6900 

CACCTCAA C X X^ rr C CT GG AGGCTCACAGTTC 6975 

ATCGATGTGGAC GGC TC G AGACjX U I ^ 705 2 

GTCAGAGAAGGACTCCAGCTCTATGGAAGGCGCGCCCCC^^ 7125 

CCATGGGCAGCCAGCCCACAACACCACGAGTGGCCTCCCCTO 72 °° 

AACATGCTGATGAGGCTGCAGGACK3CAKCAACT 7 "0 

AGCCATCACGATGACATCTTGGACTCCTCTTTGGAGTCCAC^ 7 * 2 5 

CCrGCGGACCC TCTTCCTTCC3lCATC 7650 
CTGTGACCTCCCTAAGACACTGAAGATACTTCTCi " 
AAAAAAAAAAAAAAAAAAAAAAA 



a.TCAT CereCGTTCAAATGAAAA AAAAAAAAAAA 772 S 
— — ; ' 7748 



At multiple positions heterozygous sequences have been observed. The
ambiguities are denoted in the IUPAC-IUB codes, which are as follows: R = A or
G; Y = C or T; W = A or T; M = C or A.



The region between position 5425 and 5433 is absent in cDNAs from HeLa and
colorectal adenocarcinoma tissue. Other cDNA sources are heterozygous (fragment
present and absent) at this position.

cDNA from frontal cortex is heterozygous for the presence or absence of the
region between 5924 and 6024. Absence of this fragment results in an out-of-
frame deletion of 101 bp, resulting in a premature stop in translation.
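
The frame consequence stated above follows from the length of the deleted region. A minimal Python sketch of the arithmetic, assuming the stated positions are 1-based and inclusive (the helper names `region_length` and `keeps_reading_frame` are illustrative only):

```python
def region_length(first, last):
    """Length in nucleotides of an inclusive, 1-based region."""
    if last < first:
        raise ValueError("last must not precede first")
    return last - first + 1


def keeps_reading_frame(first, last):
    """True if deleting the region removes a whole number of codons."""
    return region_length(first, last) % 3 == 0


if __name__ == "__main__":
    print(region_length(5924, 6024), keeps_reading_frame(5924, 6024))
    print(region_length(5425, 5433), keeps_reading_frame(5425, 5433))
```

On this reading the region from 5924 to 6024 spans 101 nucleotides, which is not a multiple of three, so its absence shifts the reading frame. The region from 5425 to 5433 spans 9 nucleotides, so its absence removes three codons without a frame shift.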

The sequence in bold corresponds to the fragment in the 3'-UTR Hs-unc-53/2 that
was used in RH mapping. The primers used to amplify SHGC-33456 are underlined.

Three variants have been found for the 5' end of the gene. For these variants,
the sequence from position 1 to position 366 should be replaced by one of the
following sequences:



TG^GAOTTGGTGCTGATTTCCTTGGCTGGCGGGAACTCTGTCTGGCTGTTGCATGCATCACT 75

ATT^TCTTCCTCTGTGGATTTGGAAGCATCGCTGAAGG 150

TCTGAGTCCAGCCAACAGCAGAAGAGAAAGCCAGTTATCCACGGACTGGAAGATCAAAAGAGG 213



tIatactttggggtgcacatggctattgatctctactgcggtttggcttgtctgtggggaatacatgagccccga 75 

iAACAACTGGACTTTATTGAGTGTTTACCATGCACCAAGCCCTGGGCTAAACACTTCATCTGCAGGCT^ 75 

^ActGCAAACCCAGTAGGTAGGTATAACTATCCCCACTCTGCAGATGCAGAAACGGAG^ 150 

gSSctaaa^gctcaccaggaggctagaaggtggccacacctagctggcccccctgactccac^ 225 

°^J^^Stgcaagaa T gtgactccaagtttttccttccttctggatccaactctggcttcactctg 300 

ctcagcaaccag 
Figure 1d. Amino acid sequence of the protein encoded by the Hs-unc-53/2 gene

mPAILVASKmKSGLPKPVHSAAPILHVPPAI^^ 75 

LRKSGSVENGFDTQIYTDWANHYLTKS^ 150 
DACLNFIJUUCGINIQGLSAEEIRNGNLKAILGLFFSLSRYKQQQQQPQKQHLSSPLPPAVSQVAGAPSQCQAGTP 225 

QQQVPWPQAPCQPHQPAPHQQSKAQAEMQSRLSGPTARVSAAGSEAKTRGGSTTANNRRSQSFNNYB 300 

PPPPPSSHEKEPIASSASSHPGMSDNAPASLESGSSSTPTNCSTSSAIPQPGAATKPWRSKSLSVKHSATVSMLS 375 

VKPPGPEAPRPTPEAMKPAPNNQKSMLEKLKLFNSKGGSKAGEGPGSRDTSCERLETLPSFEESEELEAASRMLT 450 

TVGPASSSPKIALKGIAQRTFSRALTNKKSSLKGNEKEKEK^^ 525 
GAAVPEMPKKSSKIASFIPKGGKLNSAKKEPMAPSHSGIPKPGMKSMPGKSPSAPAPSKEGERSRSGKLiSSGLPQ 600 

QKPQLDGRHSSSSSSIASSEGKGPGGTTIOT 675 
LYRSQTOTEGNVTAESSSTGVSVEPSHFTKTGQPALE^ 750 
HSTLETTFDTNVTTEMSGRSILSLTGRPTPLS^ 825 
PLRRQLASRGSSVCHVDVSDKAGDEMDLEGISMDAPGY 900 
DGMAWRETLQRNTSLGLGDADSWDD^^ 975 
SQVERNSLWSGDDVKKSIXXSSDSGIKMEPGSKWRRNPSDVSDxSDKSTSGKKNPVISQTGSWRRGMTAQVGITM 1050 
RTKASAPAGALKTPGTGKTDDAKVSEKGRLSPKASQVKRSPSDAGRSSGDESKKPLPSSSRTPTANANSFGFKKQ 1125 
SGSAAGLAMITASGVTVTSRSATIX3KIPKSSALVSRSAGRKSSM>3AQNQ 1200 
SRNGAGNRSSTSSIDSNISSKSAGLPVPKLREPSKTALGSSLPGLVNQTDKEKGISSDNESVASCNSVKVNPAAQ 1275 
PVSSPAQTSLQPGAKYPDVASPTLRRIiFGGKPTKQVPIATAENMKNSWISNPHATMTQQGNLDSPSGSGVLSSG 1350 
SSSPLYSraTTOI^QSPIJVSSPSSAHSAPSNSLTWGTN^^ 1425 
HNSSTGLIASSroDSLTPFVRTNSVXTT^ 1500 
LRTQEDAKEWLRSHSAGGLQDTAANS PFSSGSSVTSPSGTRFNFSQLAS PTTVTQMSLSNPTMLRTHSLSNADGQ 1575 
YDPYTDSRFRNSSMSLDEKSRTMSRSGSFRDGFEEVHGSSLSLVSSTSSVYSTPEEKCQSEIRKLRRELDASQEK 1650 

VSALTTQLTANAHLVAAFEQSLGNMTIRLQSLTMTAEQKDSEI^ 1725 
KGNGTAQSADLRIRRQHSSDSVSSINSATSHSSVGSNIESDSKKXKRKNWvneLRSSFKQAFGKKKSPKSASSH^ 1800 
DIEEMTDSSLPSSPKLPHNGSTGSTPLLRNSHSNSLISECMDSEAETV^ 1875 
LDQLREAMNRMQSEIEKLKAENDRLKSESQGSGCSRAPSQVSISASPRQSMGLSQHS 1950 
CSARKEGGRHVKIWSFQEEMKWKEDSRPHLFLIGCIGVSGKTKWDVLTCVVRRLFKEY 2025 
VTjGYSIGEIKRSNTSETPELLPCGYLVGENTTISVTvXGLAENSLD^ 2100 
G PSGTGKT YLANRLS E YI VXREGRELTDGVI ATFNVDHKS SKELRQ YLSNLADQCNS ENNAVDMPLVT I LDNLHH 2175 
VSSLGEIFNGLI^CKYHKCPYIIGTMNQATSSTPNLQLHHNFRWVLCANHTEPVKGFLGRFLRRKLMETEISGRV 2250 
RNMELVKI I DWI PKVWHHLNRFLEAHSSSDVT IGPRLFLSC PI DVDG S RVWFTDLWNYS 1 1 P YLLEAVREGLQLY 2325 
GRRAPWEDPAKWVMDTYPWAASPQQHEWPPLLQLRPEDVGFDGYSMPREGSTSK 2400 
AANYSSPQSYDSDSNSNSHHDDILDSSLESTL 2432 

Putative start methionines at positions 1 and 10 are in lower case. The
residue at position 1018 (denoted by x) is encoded by a heterozygous sequence.
Either aspartic acid (D) or glutamic acid (E) can be incorporated. The
amino acid sequence VNE at positions 1776 to 1778 is present or absent depending
on the allele from which the protein is translated.

For translation of the 3 variants described in figure 1c, the amino acid sequence
from position 1 to 89 has to be replaced by the following amino acid sequences:

Variant 1 

mESVSESSQQQKRKPVIHGLEDQKR 25
Variant 2 

mAIDLYCGLACLWGIHEPR 19
Variant 3 

mQECDSKFFLPSGSNSGFTLLSNQ 24 
Figure 1e. Nucleotide sequence of Hs-unc-53/3.

TAGAAGCATTTTCTTTGGCAGCAAGAAGATA 7 5 

AGGCAGCCAGCTGTTGGGTCAAAGCCTGTGCATACTGCTCTTCCGATACCAAATCTTGGCACTACTGGGTCACAG 150 
CACTGTTCTTCAAGACCTTTGGAACTTGCTGAAACAGAGAGCTCCAT 225 
ACCTGTGAATTTGGAGAGAAGAAACCCCTCCAAGGAAAAGCCAAGGAGAAAGAAGACAGCA 300 
TGGGCCAACCACTACCTAGCAAAATCAGGCCACAAGCGGCTGATCAAGGACTTGCAACAAGACATTGCAGATGGA 375 
GTACTCCTAGCAGAAATCATCCAGATTATTGCAAATGAAAAAGTTGAAGATATCAATGGATGTCCTA 450 
TCTCAGATGATTGAAAATGTTGATGTCTGCCTTAGTTTTCTAGCAGCCA 525 
GCTGAAGAAATAAGAAATGGAAACTTAAAAGCCATTCTAGGGCTGTTTTTCAGTTTATC 600 
CAACACCATCAACAACAGTACTATCAGTCCTTGGTGGAACTTCAGCAGCGAGTTACTCACGCTTCCCCTCCATCG 675 
GAAGCCAGCCAGGCCAAAACCCAGCAAGATATGCAGTCCAGTCTGGCAGCCAGATATGCAACTCAGTCTAATCAC 750 
AGTGGAATTGCAACCAGTCAAAAAAAGCCTACTAGGCTTCCAGGGCCCTCTAGGGTGCCTGCTGCAGGAAGCAGC 825 
AGCAAGGTCCAGGGAGCCTCTAATTTAAATAGGAGAAGTCAGAGCTTTAACAGCATTGACAAAAAC AAGCCTCCA 900 
AATTATGCAAATGGAAACGAAAAAGATTCCTCCAAAGGACCTCAATCGTCTTCAGGTGTAAATGGTAACGTGCA 975 
CCTCCCAGTACTGCTGGGCAGCCTCCTGCCTCTGCCATCCCTTCTCCAAGTGCCAGCAAGCCCTGGCGCAGCAAG 1050 
TCCATGAATGTCAAACACAGTGCCACCTCCACCATGTTGACTGTAAAGCAGTCAAGTACAGCCACCTCCCCCACA 1125 
CCATCTTCAGACAGACTGAAGCCACCTGTCTCAGAAGGGGTCAAAACTGCTCCCTCAGGACAGAAATCCATGCTT 1200 
GAGAAATTCAAGCTAGTCAATGCCCGGACTGCTTTACGCCCCCCGCAGCCTCCCAGTTC 1275 
GGGAAGGATGATGATGCCTTTTCTGAATCTGGTGAAATGGAAGGTTTTAACAGTGGTC 13 50 

ACAAATAGCAGTCCCAAAGTGTCACCTAAGTTGGCCCCTCCAAAAGCTGGAAGCAAAAATCTCAGCAATAAAAAG 142 5 
TCTTTGCTACAGCCAAAGGAAAAAGAAGAAAAGAACAGGGACAAAAATAAAGTTTC 1500 
GAAGAGAAGGATCAGGTGAC AGAGATGGCTCC AAAAAAGACCTCCAAAATTGC AAGCTTGATCCCTAAGGGCAGC 1575 
AAGACAACAGCAGCTAAGAAGGAAAGCTTAATTCCGTCTTCCAGTGGTATTCCAAAACCAGGCTCTAAAGTTCCA 1650 
ACAGTAAAGCAAACCATTTCACCTGGCAGCACAGCAAGCAAAGAGTCTGAGAAATTCAGGACTACCAAGGGGAGC 1725 
CCTTCCCAGTCCTTATCTAAGCCTATAACCATGGAGAAAGCAAGTGCTTCTAGTTGTCCTGCCCCTTTGGAAGGA 1800 
AGGGAAGCTGGCCAAGCTTCTCCTTCTGGTTCCTGTACCATGACAGTGGCACAAAGCAGTGGGCAGAGCACAGGA 1875 
AATGGTGCTGTCCAACTCCCTCAACAGCAGCAACATAGCCACCCGAATACCGCGACAGTGGCACCATTCATTTAC 1950 
AGGGCACATTCAGAAAATGAAGGTACCGCTTTACCATCGGCTGACTCCTGTACCAGTCCTACAAAGATGGACTTA 2025 
TCATATAGTAAGACTGCTAAGCAGTGCCTGGAGGAGATATCTGGTGAAGGCCCTGAAACAAGAAGAATGAGAACA 2100 
GTTAAAAACATAGCAGACTTGAGGCAGAATTTAGAAGAGACTATGTCCAGTCTTCGTGGGACTCAGATAAGCCAC 2175 
AGCACCCTGGAGACAAC ATTTGACAGC ACTGTGACAACAGAAGTTAATGGAAGGACCATACCCAACTTGACAAGT 2250 
CGACCCACCCCCATGACCTGGAGGTTGGGCCAGGCATGTCCGCGACTTCAGGCGGGAGATGCTCCCTCCCTGGGT 23 25 
GCTGGCTATCCTCGCAGTGGTACCAGTCGATTCATCCACACAGACCCCTCGAGGTTCATGTATACCACGCCTCTC 2400 
CGTCGAGCTGCTGTCTCTAGGCTGGGAAACATGTCACAGATTGACATGAGTGAGAAAGCAAGCAGTGACCTGGAC 2475 
ATGTCTTCTGAGGTCGATGTGGGTGGATATATGAGTGATGGTGATATCCTTGGGAAAAGTCTCAGGACTGATGAC 2550 
ATCAACAGTGGGTACATGACAGATGGAGGACTTAACCTATATACTAGAAGTCTGAACCGAATACCAGACACAGCA 2 62 5 
ACTTCCCGGGACATCATCCAGAGAGGGGTTCACGATGTGACAGTGGATGCAGACAGCTGGGATGACAGCAGTTCA 2700 
GTGAGCAGTGGTCTCAGTGACACCCTTGATAACATCAGCACTGATGACCTGAACACCACATCCTCTGTCAGCTCT 2775 
TACTCCAACATCACCGTCCCCTCTAGGAAGAATACTCAGCTGAGGACAGATTCAGAGAAACGCTCCACCACAGAC 2850 
GAGACCTGGGATAGTCCTGAGGAACTGAAAAAACCAGAAGAAGATTTTGACAGCCATGGGGATGCTGGTGGCAAG 2925 



ACAGGTTCCTGGAGAAGAGGCATGTCTGCCCAAGGAGGGGCGCCATCTAGGCAGAAAGCTGGAACAAGTGCACTC 307 5 
AAAACACCCGGGAAAACCGATGATGCCAAAGCTTCTGAGAAAGGAAAAGCTCCCCTAAAAGGATCATCTCTACAA 3150 
AGATCTCCTTCAGATGCAGGAAAAAGCAGTGGAGATGAAGGGAAAAAGCCCCCCTCAGGCATTGGAAGATCGACT 3225 
GCCACCAGCTCCTTTGGCTTTAAGAAACCAAGTGGAGTAGGGTCATCTGCCATGATCACCAGCAGTGGAGCAACC 3300 
ATAACAAGTGGCTCTGCAACACTGGGTAAAATTCCAAAATCTGCTGCCATTGGCGGGAAGTCAAATGCAGGGAGA 337 5 
AAAACCAGTTTGGACGGTTCACAGAATCAGGATGATGTTGTGCTGCATGTTAGCTCAAAGACTACCCTACAATAT 345 0 
CGCAGCTTGCCCCGCCCTTCAAAATCCAGCACCAGTGGCATTCCTGGCCGAGGAGGCCACAGATCCAGTACCAGC 352 5 
AGTATTGATTCCAACGTCAGCAGCAAGTCTGCTGGGGCCACC ACCTCGAAACTGAGAGAACCAACTAAAATTGGG 3600 
TCAGGGCGCTCGAGTCCTGTC ACCGTC AACC AAAC AGAC AAGGAAAAGGAAAAAGTAGCAGTCTCAGATTC AGAA 3675 
AGTGTTTCTTTGTCAGGTTCCCCCAAATCCAGCCCCACCTCTGCCAGCGCCTGTGGTGCACAAGGTGTCAGGCAG 3750 
CCAGGATCCAAGTATCCAGATATTGCCTCACCCACATTTCGAAGgttgtttggtgccaaggcaggtggcaaatct 382 5 
gcctctgcacctaatactgagggtgtgaaatcttcctcagtaatgcccagccctagtaccacattagcgcggcaa 3900 
ggcagtctggagtcaccgtcgtccggtacgggcagcatgggcagtgctggtgggctaagcggcagcagcagccct 3975 
ctcttcaataaaccctcagacttaactacagatgttataagcttaagtcactcgttggcctccagcccagcatcg 4050 
gttcactctttcacatcaggtggtctcgtgtgggctgccaatatgagcagttcctctgcaggcagcaaggatact 412 5 
ccgagctaccagtccatgactagcctccacacgagctctgagtccattgacctccccctcagccatcatggctcc 4200 
ttgtctggactgaccacaggcactcacgaggcccagagcctgctcatgagaacgggtagtgtgagatctactctc 4275 
tcagaaagcatgcagcttgacagaaatacaccacccaaaaagggactaagATATACCCCATCATCTCGGCAGGCC 43 50 
AACCAAGAAGAGGGC AAAGAGTGGTTGCGTTCTCATTCTACTGGAGGGCTTCAGGACACTGGCAACCAGTCACCT 4425 
CTGGTTTCCCCTTCTGCCATGTC ATCTTCTGC AGCTGGAAAATACCACTTTTCTAACTTGGTGAGCCCAACAAAT 4500 
TTGTCTC ARTTTAACCTTCCCGGGCCCAGC ATGATGCGCTC AAACAGCATCCCAGCCCAAGACTCTTCCTTCGAT 457 5 
CTCTATGATGACTCCC AGCTTTGTGGGAGTGCC ACTTCTCTGGAGGAAAGACCTCGTGCC ATCAGTCATTCGGGC 4650 
TC ATTCAGAGACAGC ATGGAAGAAGTTC ATGGCTCTTCATTATGACTGGTGTCCAGCACTTCTTCTCTTTACTCT 4725 
ACAGCTGAAGAAAAGGCTCATTCAGAGCAAATCCATAAACTGCGGAGAGAGCTGGTTGCATCACAAGAAAAAGTT 4800 
GCTACCCTCACATCTCAGCTTTCAGCAAATGCTCACCTTGTAGCAGCTTTTGAAAAGAGCTTAGGGAATATGACT 4 87 5 




3000 
AACATCATCACTGTGAACCTCAAA

AAACCAATTACCCAAAGGTACTTTAACTTGTTGATGGAGCATCACAGAATTATACTCTCAGGACCGAGTGGTACT 6225 

GGAAAGACCTATTTGGCAAACAAACTTGCTGAATATGTAATAACCAAATCTGGAAGG 6300 

ATTGCCACTTTTAATGTGGACCACAAGTCAAGTAAGGAATTGCAACAATATCTAGCTAACCTGGCTGAACAOT 6375 

AGTGCTGATAATAATGGAGTGGAGCTCCCAGTTGTAATAATTCTTGATAATCTTCATCATGTGGGCTCTCTGAOT 6450 

GATATCTTCAATGGTTTTCTCAATTGTAAATACAACAAATGTCCATATATTATTGGAACAATGAATCAGGGAG^ 6525 

TCTTCATCACCAAATCTAGAGCTGCATCACAATTTCAGGTGGGTATTATGTGCAAATCATACAGAACCAGTGAAA 6600 

GGCTTTTTAGGCAGATATCTTCGAAGAAAACTCATAGAGATAGAAATTGAAAGGAACATTCGCAATAATGACCTA 6675 

GTCAAAATTATAGATTGGATTCCGAAGACGTGGCATCATCTCAACAGTTTTTTGGAAACACACA 6750 

GTTACCATTGGTCCCCGACTATTCCTTCCTTGCCCCATGGATGTAGAAGGTTCTAGAGTATGGTTCATGGATCTC 6825 

TGGAACTATTCTTT AGTACCTTATATTCTGGAGGCAGTGAGAGAGGGTCTTCAGATGTATGGGAAACGCAC ACCA 6900 

TGGGAAGATCCTTCAAAGTGGGTGCTTGACACATATCCATGGAGCTCAGCAACTCTGCCTCAGGAGAGCCCAGCC 

TTACTTCAGCTGCGACCAGAAGATGTTGGGTATGAAAGCTGCACATCCACTAAGGAAGCCACAACCTCAAAGCAC 

ATTCCACAAACTGACACAGAAGGAGATCCCCTGATGAATATGCTAATGAAACTCCAAGAAGCAGCCAATTACTCG 



The region from position 3795 to 4325 consists of two blocks (3795 to 4283 and
from 4284 to 4325) that independently can be present or absent in cDNA molecules
from frontal cortex tissue. Frontal cortex is also heterozygous for the region
from 5153 to 5173. The region from 5343 to 5408 is absent in frontal cortex,
but heterozygously present in heart cDNAs.

The nucleotide sequence is heterozygous at position 4509. R is the IUPAC-IUB
code for A or G. The amino acid sequence is not affected.
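The statement that the heterozygous base does not change the protein can be checked by expanding the affected codon. The sketch below assumes that position 4509 is the third base of the codon CAR, which is inferred here from the surrounding nucleotides (TCT CAR TTT) and the corresponding residues S-Q-F in figure 1f. The helper name `encoded_residues` is hypothetical, and only the four codons needed for the illustration are tabulated.

```python
from itertools import product

# Unambiguous bases plus the single ambiguity code used at position 4509.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG"}

# Subset of the standard genetic code covering the CAN codon box.
CODON_SUBSET = {"CAA": "Q", "CAG": "Q", "CAT": "H", "CAC": "H"}


def encoded_residues(codon):
    """Return the set of residues encoded by all expansions of `codon`."""
    concrete = ("".join(p) for p in product(*(IUPAC[b] for b in codon)))
    return {CODON_SUBSET[c] for c in concrete}


# Both CAA and CAG encode glutamine, so the variation is synonymous.
print(encoded_residues("CAR"))
```

A result set with a single element means every allele encodes the same residue; more than one element would indicate an amino acid polymorphism.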

An alternative 5' end has been observed. In this variant the sequence from
position 1 to 288 is replaced by the following DNA sequence:

TAGTTTGCTGCTTTTTTGAAGAGATTCCATTTTGAAGGGCAAGAACCTAATGTGATGGATTTATCTTCAGAA 75
AACAGACATGGGAAGAATCCAGTGAGTCACAAGCTAGAAGATCAGAAGAAG 
Figure 1f. Protein sequence encoded by the Hs-unc-53/3 gene

mPVLGVASKLRQPAVGSKPVHTALPIPNLGTTGSQHCSSRPLELAET^ 75 
EKEDSKIYTDWANHYIJUCSGHKRLIKDLQQDIAIXSV^ 150 
ARGVNVQGLSAEEIRJKGNLKAIUSLFFSLSRYKQQQHHQQ 225 
AARYATQSNHSGIATSQKKPTRLPGPSRVPAAGSSSKVQGASNLNRRSQSFN^ 300 
SSSGVNGNVQPPSTAGQPPASAIPSPSASKPWRSKSMNVKHSATST^u^TVKQSSTATSPTPSSDRLKPPVSEGVK 375 
TAPSGQKSMLEKFKIjVTIARTALRPPQPPSSGPSDGGKDDDAFSESGEM^ 450 
AGSKNLSNKKSLLQ PKEKEEKNRDKNKVCTEKPVKEEKDQVTEMAPKKTSKIASLIPKGSKTTAAKKESLI PSSS 525 
GIPKPGSKVPTVKQTISPGSTASKESEKFRTTKGSPSQSLSKPITMEKASASSCPAPLEGREAGQASPSGSCTMT 600 
VAQSSGQSTGNGAVQLPQQQQHSHPNTAWAPFIYRAHSEl^GTALPSADSCTSPTKMDLSYSKTAKQCLEEISG 675 
EGPETRRMRTVKNIADLRQNLEETMSSLRGTQISHSTLETTFDSTVT^ 750 
LQAGDAPSI^AGYPRSGTSRFiHTDPSRFWYTTPLRRAAVSRLGNMSQ 825 
ILGKSLRTDDINSGYIOTTCGLNLYTRSL^ 900 
DLNTTSSVSSYSNIWPSRKNTQLRTDSEKRSTTDETWDSPEELKKPEEDFDSHGDAGGKWKWSSGLPEDPEK^ 975 
GQKASLSVSQTGSWRRGMSAQGGAPSRQKAGTSALKTPGKTDDAKASEKGKAPLKGSSLQRSPSDAGKSSGDEGK 1050 
KPPSGIGRSTATSSFGFKKPSGVGSSAMITSSGATITSGSATLGKIPKSAAIGGKSNAGRKTSLDGSQNQDDWL 1125 
HVSSKTTLQYRSLPRPSKSSTSGIPGRGGHRSSTSSIDSNVSSKSAGATTSKLREPTKIGSGRSSPVTVNQTDKE 12 00 
KEKVAVSDSESVSLSGSPKSSPTSASACGAQGLRQPGSKYPDIASPTFRRlfgakaggksasapntegvksssvm 1275 
pspsttlarqgslespssgtgsmgsagglsgsssplfnkpsdlttdvislshslasspasvhsf tsgglvwaaiim 13 50 
ssssagskdtpsyqsmtslhtssesidlplshhgslsglttgthevqsllmrtgsvrstlsesmqldrntlpkkg 1425 
LrYTPSSRQANQEEGKEWLRSHSTGGLQDTGNQSPLVSPSAMSSSAAGKYT!^ 15 00 

SIPAQDSSFDLYDDSQLCGSATSLEERPRAISHSGSFRDSMEEVHGSSLSLVSSTSSLYSTAEEKAHSEQIHKLR 1575 
RELVASQEKVATLTSQLSANAHLVAAFEKSLGNMTGRLQSLTMTAEQKESELIELRETIEMLKAQNSAAQAAIQG 1650 
ALNGPDHPPKDLRIRRQHSSESVSSINSATSHSSIGSGNDADSKKKKKKNWVnsrgselRSSFKQAFGKKKSTKP 1725 
PS SHSDI EELTDS SLPAS PKL PHNAGDCGS ASMKPSQS AS Asp 1 vwppkkrqngpviykhr s r ICECTEAEAEI I 1800 
LQLKSELREKELKLTDIRLEALSSAHHLDQIREAMNRMQNEIEILKAENTDRLKAETGNTAKPTRPPSESSSSTSS 187 5 
SSSRQSLGLSLNNLNITEAVSSDILLDDAGDATGHKDGRSVKIIVSISKGYGRAKDQKSQAYLIGSIGVSGKTKW 1950 
DVLDGVIRRLFKEYVFRIDTSTSLGLSSDCIASYCIGDLIRSHNLEW 202 5 

DSFVFDTLIPKPITQRYFNLLMEHHRIILSGPSGTGKTYL 2100 
QYLANLAEQCSADNNGVEL PWI I LDNLHHVGSLS DI FNGFLNCKYNKC PYI IGTMNQGVS S S PNLELHHNFRWV 2175 
LCANHTEPVKGFLGRYLRRKLIEIEIERNIRNNDLVKIIDWIPKTWHHLNSFLETHSSSDVTIGPRLFLPCPMDV 2250 
EGSRVWFMDLWNYSLVPYILEAVREGLQOTGKRT^ 23 2 5 

STKEATTSKHIPQTDTEGDPLMNMLMKLQEAANYSSTQSCDSESTSHHEDILDSSLESTL 23 8 5 

Regions corresponding to heterozygous sequences encoding presence or absence of 
this region are in lower case letters. These regions are from 132 6 to 1413 ; 
from 1414 to 1427 ; from 1703 to 1709 and from 1768 to 1788. 

Putative start methionines at positions 1 and 51 are indicated in lower case. 

For the variant mentioned in figure 1e, the amino acid sequence from position 1
to 81 has to be replaced by the following amino acid sequence:

mDLSSEmNRHGKNPVSHKLEDQKK 24
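The replacement rule described above (residues 1 to 81 exchanged for the 24-residue variant) amounts to a simple splice of two strings. The following sketch is illustrative only. The function name `apply_n_terminal_variant` is hypothetical, the first 81 residues of the reference are represented by a placeholder because only their count matters here, the downstream residues IYTDWANHYLAKSGHKR are read from figures 1e and 1f, and the final residue of the variant is taken as K on the basis of the AAG codon that ends the alternative 5' sequence in figure 1e.

```python
def apply_n_terminal_variant(reference, variant, replaced_through):
    """Replace residues 1..replaced_through (1-based, inclusive) by `variant`."""
    return variant + reference[replaced_through:]


# Placeholder for residues 1-81 of Hs-unc-53/3, followed by residues 82-98.
reference = "X" * 81 + "IYTDWANHYLAKSGHKR"
variant = "MDLSSEMNRHGKNPVSHKLEDQKK"

spliced = apply_n_terminal_variant(reference, variant, 81)
print(spliced)
```

Because the variant is shorter than the segment it replaces, all residue numbers downstream of the junction shift by 81 - 24 = 57 positions in the variant protein.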
Figure 1g. Nucleotide sequence of a 4984 bp fragment from BAC 585E09 (contains
part of the genomic sequence of Hs-unc-53/1) extending the sequence derived from
cDNA libraries shown in figure 1a.



TTCCTGATCTCAAGAGTTACTCCTTCCCCTACAAAGCCCTCAGCCCCCTCCCCAGTCAACGCTAGGCCCCTTCTC 75 
TCCAAGCCACCCGTGTCCTACCCCCATCCCCTACCTCCTGGGCTCAGGAGGGCAACCTTGAGCCTCAGAGACTGA 150 
AGTAGGGTGGGACTGGGAGTTTCCTGGGGGAAAACCAAAGACGGTTTGGG^ 225 
GGATACCATTCCTCCAGCCCTCTCCCGACATCTCTCTCAGGCCCACGGGCCCACTTTCCCTCCCGCATTCTGAGC 300 
CGCCCTCCCTCCGTCTCTTTTACCTGCACCTCCACACCTCCTCAACAGATCTTTATCCTGGACACGGCAGGGGGT 375 
CCCCGTGCCCTCCGAGAATCGAAGAACCCGCCCCGCTTCTACGCGGAAAGCTG^ 450 
ATTTCCCCCTACCCCCCACTCATCCGCCCCTGGAGCTCCGCTCGCAGATACCTCCCCCTCCCGAGCCAGAAATAG 525 
ACACACTATCTCTCCCCCACCTCCCTCCCCGTGCGCACACTCGCTCCCCTCCTCCTGTTTGCTCCCCGCCTTCCC 600 
CTTCCCTCCTTCCTCTGCTCGGAGCTGCAGCCTGCAGCCTCGACTCGGGCTGGCTGGCTGGCTGAGTGCGGCCGG 675 
GGCGCTGCCCGGCAGTGCGGTGTCCACGGGACTGAC AGGCAGGCAGGCAGGCCGCGGGCTGGGATCCGGACACCA 750 
AAGC AAAAGCACCGCTGGGCGCCGGAGGAGCCGCGGGGCTTCCATCCTTCCTTTGACTC 825 
TTGTATTTTCCCCGCCGCCCCGCCCCTTTTCCTCCGACCCCGCCCTAT^ 900 
TTTTCCCGGCTTCCTTCCTCGCGTTTCTTTCCCCTGCGCCC 975 
CCCCCTTCTCTCCCCTTCTTCCTCGGl^CTTCCGTCCTC 1050 
GCGCTCCCGCCCCCTGCCCCCTCCCCCCGTGCCTGCAGACGCGCGGATCGTCCATGCGCTCCTCGCGGGCAGAAT 1125 
GCTGGGCAGCAGCGTCAAGAGCGTGCAGCCCGAGGTGGAGCTGAGCAGCGGCGGCGGCGACGAGGGCGCGGACGA 1200 
ACCGCGGGGCGCCGGCAGGAAGGCGGCAGCGGCGGACGGCAGAGGCATGCTGCCCAAGCGCGCCAAGGCGCCCGG 1275 
CGGCGGCGGCGGCATGGCCAAGGCCAGCGCGGCTGAGCTGAAGGTCTTCAAGTCCGGCAGCGTGGACAGCCGTGT 1350 
CCCCGGCGGGCCGCCCGCCTGCAACCTGCGCAAGCAGAAGTCACTCACCAACCTCTCTTTTCTCACGGACTCCGA 1425 
GAAAAAGCTGCAGCTTTATGAGCCCGAATGGAGCGACGATATGGCCAAGGCGCCGAAAGGCTTAGGCAAGGTGGG 1500 
GTCCAAGGGCCGTGAAGCTCCGCTGATGTCCAAGACGCTGTCCAAGTCGGAGCACTCGCTCTTCCAGGCCAAGGG 1575 
CAGCCCGGCGGGCGGTGCCAAGACCCCCCTGGCTCCGCTCGCGCCCAACCTGGGAAAGCCGAGCCGGATCCCTCG 1650 
AGGACCCTATGCGGAGGTCAAGCCGCTC AGCAAGGCGCCTGAAGCGGCCGTGAGCGAAGATGGCAAATCGGACGA 1725 
CGAGCTGCTCTCCAGCAAGGCCAAGGCGCAAAAGAGCTCTGGGCCTGTCCCCTCTGCCAAGGGCCAGGAGGAGCG 1800 
CGCCTTCCTCAAGGTGGACCCCGAGCTGGTGGTGACCGTGCTGGGAGACCTGGAGCAGCTGCTCTTCAGCCAGAT 1875 
GCTGGGTAAGTCCTGCCGCCCCGCCCCGCCCCGCCCCTGGCTTTCTCCTAACCAGCTGCTGGGGAAGGTGTGGGG 1950 
AAAGCGAAGCCCCCTCCCCTTGCGCCTTCCCGGAGGGCCCTCCTGTTCACGATC AGGCTGTGATGGGCATTGCGC 2025 
CC AGATGTGCTGAGCTGGCCCACCTCCAGATGCGCATGGCTCAAGTGTACCTTCTTAAAGACATTACAGGCGGGA 2100 
ACCCGGGCTCGACTCTGGCCTGCCCGAGGTGAGGCTGGACAATGGGATGGGGGGTGAGGGGGTTACAGGCTCTCA 2175 
GAAATAGAGCCAGAATCCCAATATGGCAAAACCTGGGACTGGTGGAAACCTCCGTTGTGGTGTGGCCTGCGCTTG 2250 
AC AGGAGCATCCCGCATTGCAAGGGGAGCGTCCAGCGAGAGCCCGGATCTAGAGGACAGATGTGGGAGAGCAGAT 2325 
GTGAGGGCTGATTGGCCCCGGAACAC AGCTGAGGCTCC ACTTCCTCTGTGGATCCCGAGTGGGAGCGCAAGTCGG 2400 
ATTTCCCCGCGGTGTGAGGATTCTGGCTAAAAGAAGCGTCTAGGGCCGGGGGCGGGCGGGCTGCCAGCTGTGCGC 2475 
ATCTGGGCGCATGTCCGATACGTCAGCCCCGGCTCTGGCCCCAACCCCTACACCCGCAGGTCTTTTAGGGCGTGT 2550 
CGAAAGCTCTGGGCGTTAGCGCCGAGACTCCTGTTTGACCGGGAAGCCTTTGCCCTGTGGCTAATGGAAGCCGAG 2625 
C AGGCGGGAAGGGAGGAACAAAGCTTGCTCGAGTGGAGGAAGCGCGCAGAGCTGTTCCATTGTTCTCCGTGCCTG 2700 
AAGAGTCTATGC AAAAAAACCCGAAGCGGGCCCCGGAACTGCTCTTTCTCTCCCCGGAGAGCCCCTGCCCTCAGA 2775 
GAGGAATAGATCTGGGATGTGCCGGACGCCAGGCGGCCATGCCTCGGGAACTGGCACGGGCCCTCTTGGGGCCAC 2850 
GGAACAAGGACGGTGGGGCCTGGTGCCCAGGCGAGCTGCTTTGGCTGCGCGGACTTGTTGCGGTGGCTGGGTTGT 2925 
GGGTCCTCCGGCGCCGAGGGACCCGAGGTTCCTGGGTACCCGGCAGGCTGCCCGCCCGCTGGGGGCTGGGAAGGG 3 000 
GCGTGCCAATGCGCGTGTGAAAGGGCGGGGCCGAGTGACGGGCTTGGTGGGTGGGGAACATGCAAGAGCTCGCCG 3075 
GGCGGCCCTGGAGAATGCGAAGCCGGAGGAGACCGGTTCGGCCTGCTGCAGTCTTCCTAGGAACCCTCGACTCCT 3150 
GTGTGGGCTAGGATGAGGGTCCTCTGACAGGGGCAAGGATTTGGGCCTTTGGAGAACCGATCCTTACGC AGGAGG 3225 
CCGCAAATGGGCTTTGCAGGGGCAATCAGGAGACTGGACAAGGGCAAAAGAAGAGCAGCCTTTTCCCCTGGGAGC 3300 
CCCTCCTGAAGGTGGGGATGGCTGGGTGGGTGCGGAAGCTGACCAGGCAGCCTCACTCTGC AAAGGGAATGTGCC 3375 
ACCCGGTCCTC AGTGTGGGGCTGAGCCTGTCAAAGGCCCTGCCTCAGTGAATGGGGCAAGAGAGAC AATAAGGGA 3 450 
AAAAATTAATAAATTTTTGGCAGGCACCATGGCAGGCACCAAGGAGGGATATGGACAAAATGCAACTGGCCCATG 3525 
TGATAGAGAGCCCTGCTGAGGGACTGAAAGCAGGGTGAGAGAGGAAGGGACCGTGTGTGTGTGTGTGTGTGTGTG 3 600 
TGTGTGTGTGTGTGTGTGTGTGTGTGTGTCAGACTGAGAGAGCGACTTGGCGAGGGAGGGC AAGGGAGTATCGGG 3 675 
GCACAGAATAGCAGAAGGCACAGAACCCTTTTCAGGGTCAAGGCTTTACTTGTTGGGGATAACTTAGCTGGTCTG 3750 
GGTCCTCTCCAGACCTGGATGCCCTC ACACTGTCCCAGAAGCTGACTGCCCATTGAAGCCCTCTTAGTTGCTGCT 3 825 
CAAGAGGAGCCAAACAGGTCTGAGCTGGCTAGGGAGATGGGAGGAGGGGCAGGAGTGGGGAGGAGGGCAGGTGCA 3 900 
GGGAGGGCGAGAGAGGAGGAGAAGCTGAGCTGTGGTCCCTTATTCCTGCTTAGCAGTTGTCACTTCTCAAAGCAC 3975 
ACTGAC ACTTTC AGTAACCTCGGAAGTGAGGAGAGAACACCTCC ACTTCCCAGTTGGGGAAATGCAGAGTC AAAA 4050 
GC ATTGAGGGCCTTAAAGGCATCTATGAGTTCATGGTGGAAGGG AGATTCC AC ATTGATCTCCTGAGGACTAAGT 4125 
CGCAGCTCTTCCTAGGAG ACCTGATTGAGAGAGGAAGAGTC AGC AGGCAGAGGGACCTGCCCAAGGCTATGTCTA 4200 
CTGGGTATGGGCCACC AGAACTGCCTCGATGACCCTAC AGAGGGCTGAGGGGCTTAGCTCTCTGGGGTGGGGAGA 4275 
GAAGGGTGGAAACTCCCAAATCTGCCTGTCTCCAGCTGAGAGGACCCAAAGTTGGGGGGTGGGGAGTTGGTTCAG 4350 
GCTGTAGCAAGGC AGAGCCTGGTGTC AAACAGTGGT AGGGAGGAAAGGAGGGGAGTTGGTGACCTCCAAACTAAG 4425 
CTTTTCCCTGTGTGAAGGGCAGAGGGT AGACTGCCTGGGGGAGGGGTAGAGGGAGAGGAACT ACAGAGGGAATTC 4500 
GTCTTCC AGAGCCAATGATGGTGGTGTTC AGGTATCAGACAGGCCCTC AGTGTACAGCAGGGTGGCCTCTGGGGA 4575 
GAAGAATGGTGACTTG ATGTTTC AGG ATTGTGATTGAAGACACTGGGCATTTGTCCCCACCTC AGTGGGGCTC AG 4650 







TGTCCAGTTATGTTCACTCGATAGTACC ATCCTAGATCCAAGAGGCTGCCAAGAATCAATTTCTGAGGCGGAGGG 4725 
AGGGGGTGGGAGTGAGGCAGCTTCAAGTCAGAGCCTTTCTGTAATAAGAGGGAAGGACTGAAACCTGATCATCCC 4800 
CTTCCCAGAAATCAGCTGGGGTCCCAGATGGTCTAGGCAGGCTCCCTGTCCCTTCGCTAACCTTGGAAGCTGCCA 4875 
AATAACTAGGGC CCCACTGGGGAAC CC TAG CAACTTGGAAGACTGAGGAGTGAGTACCGAGGGCAAATGGGCTAA 4950 
TTCCAGGAATTAGATGCCTCTGGACCCTGGCCCG 4984 



The sequence shown in figure 1a starts at position 1246. Upstream, in the same
reading frame as used for the translation of the DNA sequence in fig 1a into the
protein sequence of fig 1b, a stop codon is found at position 815. A first
putative start codon (ATG) can be found at position 1124. Assuming this start
codon, the protein sequence from fig 1b is extended by the sequence
MLGSSVKSVQPEVELSSGGGDEGADEPRGAGRKAAAADGRG
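
The extension stated above can be reproduced by translating the genomic sequence from the putative start codon at position 1124 up to position 1246. The sketch below is an illustration rather than part of the original disclosure. It uses the three sequence rows of figure 1g that end at positions 1125, 1200 and 1275, and the standard genetic code; the names `translate` and `slice_1based` are hypothetical helpers.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Positions 1051-1275 of the BAC 585E09 fragment as printed in figure 1g.
FRAGMENT_START = 1051
FRAGMENT = (
    "GCGCTCCCGCCCCCTGCCCCCTCCCCCCGTGCCTGCAGACGCGCGGATCGTCCATGCGCTCCTCGCGGGCAGAAT"
    "GCTGGGCAGCAGCGTCAAGAGCGTGCAGCCCGAGGTGGAGCTGAGCAGCGGCGGCGGCGACGAGGGCGCGGACGA"
    "ACCGCGGGGCGCCGGCAGGAAGGCGGCAGCGGCGGACGGCAGAGGCATGCTGCCCAAGCGCGCCAAGGCGCCCGG"
)


def slice_1based(first, last):
    """Return the bases from position `first` to `last` (figure numbering, inclusive)."""
    return FRAGMENT[first - FRAGMENT_START:last - FRAGMENT_START + 1]


def translate(dna):
    """Translate complete codons of `dna` using the standard genetic code."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


extension = translate(slice_1based(1124, 1246))
print(extension)
```

The 123 bases from 1124 to 1246 form 41 codons, and the codon that immediately follows (positions 1247 to 1249) is again ATG, in the same frame as the upstream start codon.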

Intronic sequence has been found to start at position 1881. 
Figure 1h. Illustration of a 5'-deletion variant of Hs-unc-53/3 discovered by
Nagase et al. (1999, DNA Res. 6:63-70).



>KIAA0938 protein, amino acid sequence 
MCVTKKLFFIVQRTIFVGCVIWKFCLHYVLRGFLCFNSMQLDRNT 

LPKKGLRYT P S SRQANQEEGKEWLRS H STGGLQDTGNQS PLVS P S AMS SS AAGKYHFSN 
LVSPTNLSQFNLPGPSMMRSNSIPAQDSSFDLYDDSQLCGSATSLEERPRAISHSGSFR 
DSMEEVHGSSLSLVSSTSSLYSTAEEKAHSEQIHKLRRELVASQEKVATLTSQLSANAH 
LVAAFEKSLGNMTGRLQSLTMTAEQKESELIELRETIEMLKAQNSAAQAAIQGALNGPD 
HPPKDLRIRRQHSSESVSSINSATSHSSIGSGNDADSKKKKKKNVJVNSRGSELRSSFKQ 
AFGKKKSTKPPSSHSDIEELTDSSLPASPKLPHNAGDCGSASMKPSQSASAICECTEAE 
AE 1 1 LQLKSELREKELKLTDI RLEALS S AHHLDQ I REAMNRMQNE I E I LKAENDRLKAE 
TGNTAKPTRPPSESSSSTSSSSSRQSLGLSLNNLNITEAVSSDILLDDAGDATGHKDGR 
SVKIIVSISKGYGRAKDQKSQAYLIGSIGVSGKTKWDVTjIXjV"IRRLFKEYvPRIDTSTS 
LGLSSDCIASYCIGDLIRSHNLETVPELLPCGYLVGDNNIITVNLKGVEENSLDSFVFDT 
LIPKPITQRYFNLLMEHHRIILSGPSGTGKTYI^KIAEYVITKSGRKKTEDAIATFNV 
DHKSSKELQQYIJUtfliAEQCSADNNGvTSIjPW 

Y I IGTMNQGVS S S PNLELHHNFRWVT.C ANHTEPVKGFLGRYTiRRKLI E I EI ERNIRNND 
LVKIIDWIPKTWHHLNSFLETHSSSDvTIGPRLFLPCPMDV^GSRVWFMDLWNYSLVPY 
ILEAVREGLQMYGKRTPWEDPSKWVLDTYPWSSATLPQESPALLQLRPEDVGYESCTST 
KEATTSKHIPQTiyrEGDPr^lNMLMKLQEAANYSSTQSCDSESTSHHEDILDSSLESTL'' 

>AB023155 cDNA nucleotide sequence

ctatcactaa actgtcattg aattgtactg cattagaaag gaactcaaat atgtgtgacg 60 
gcaatggaca tcttgtcacc tttagttggc ctttttcaat gagttaagca ttatatgtgt 120 
gttaccaaaa aattattttt tatagttcag agaaccattt ttgttggatg tgtaatttgg 180 
aagttttgtt tacattatgt ccttaggggt tttctttgtt ttaacagcat gcagcttgac 240 
agaaatacac tacccaaaaa gggactaaga tataccccat catctcggca ggccaaccaa 300 
gaagagggca aagagtggtt gcgttctcat tctactggag ggcttcagga cactggcaac 360 
cagtcacctc tggtttcccc ttctgccatg tcatcttctg cagctggaaa ataccacttt 420 
tctaacttgg tgagcccaac aaatttgtct caatttaacc ttcccgggcc cagcatgatg 
cgctcaaaca gcatcccagc ccaagactct tccttcgatc tctatgatga ctcccagctt 
tgtgggagtg ccacttctct ggaggaaaga cctcgtgcca tcagtcattc gggctcattc 
agagacagca tggaagaagt tcatggctct tcattatcac tggtgtccag cacttcttct 
ctttactcta cagctgaaga aaaggctcat tcagagcaaa tccataaact gcggagagag 
ctggttgcat cacaagaaaa agttgctacc ctcacatctc agctttcagc aaatgctcac 
cttgtagcag cttttgaaaa gagcttaggg aatatgactg gccgattgca aagtctaact 
atgacagcgg aacaaaagga atctgaactt atagaactaa gagaaaccat tgaaatgctg 
aaggctcaga attctgctgc ccaggcggct attcagggag cactgaatgg tccagaccat 
cctcccaaag atcttcgcat cagaagacag cattcctctg aaagtgtttc tagtatcaac 
agtgccacaa gccattccag tattggcagt ggtaatgatg ccgactccaa gaagaagaaa 
aagaaaaact gggtgaactc tagaggaagt gagctgagaa gttctttcaa acaagccttt 1140 
gggaagaaaa agtccaccaa gcctccttca tcacattctg acattgaaga gcttactgat 1200 
tcatcccttc cggcatcccc caagttaccc cataatgctg gtgactgtgg ctcagcatcc 1260 
atgaagccct cacaatctgc ttcagcgatc tgtgaatgca cagaagctga ggcagagata 1320 
attctgcagc tgaagagcga gctcagagaa aaggaattaa aattaacgga tattcggctg 1380
gaggccctca gctctgctca tcatcttgat cagatccggg aagccatgaa ccggatgcag 1440
aatgaaattg aaatactgaa agctgaaaat gaccggttga aggcagaaac tggtaacaca 1500
gctaagccta ctcggccacc gtcagaatcc tcaagcagca cctcctcttc atcttccagg 1560
cagtcattag gactttctct aaacaatttg aacatcacag aggctgttag ctcagatatt 
ttgctagatg atgctggtga tgcaactgga cataaagatg gccgcagtgt gaaaattata 
gtctccataa gcaagggcta tggtcgagca aaggaccaaa aatctcaggc atatttgata 
ggatccattg gtgttagtgg aaaaaccaag tgggatgtct tagatggtgt aataagacgt 
ctctttaagg aatatgtatt ccgaattgat acatccacta gccttggtct gagctctgac 
tgcattgcta gctactgtat aggagactta attagatccc ataacctaga agtgcctgaa 
ttgctgcctt gtggatacct tgttggagat aataacatca tcactgtgaa cctcaaaggg 
gtagaagaaa atagtttgga cagttttgtt tttgatacgc tgattcctaa accaattacc 
caaaggtact ttaacttgtt gatggagcat cacagaatta tactctcagg accgagtggt 
actggaaaga cctatttggc aaacaaactt gctgaatatg taataaccaa atctggaagg 
aaaaaaacag aggatgcaat tgccactttt aatgtggacc acaagtcaag taaggaattg 
caacaatatc tagctaacct ggctgaacag tgcagtgctg ataataatgg agtggagctc 
ccagttgtaa taattcttga taatcttcat catgtgggct ctctgagtga tatcttcaat 
ggttttctca attgtaaata caacaaatgt ccatatatta ttggaacaat gaatcaggga 
gtttcttcat caccaaatct agagctgcat cacaatttca ggtgggtatt atgtgcaaat 
catacagaac cagtgaaagg ctttttaggc agatatcttc gaagaaaact catagagata 
gaaattgaaa ggaacattcg caataatgac ctagtcaaaa ttatagattg gattccgaag 
acgtggcatc atctcaacag ttttttggaa acacacagtt cttctgacgt taccattggt
ccccgactat tccttccttg ccccatggat gtagaaggtt ctagagtatg gttcatggat 
ctctggaact attctttagt accttatatt ctggaggcag tgagagaggg tcttcagatg 
tatgggaaac gcacaccatg ggaagatcct tcaaagtggg tgcttgacac atatccatgg 
agctcagcaa ctctgcctca ggagagccca gccttacttc agctgcgacc agaagatgtt 
gggtatgaaa gctgcacatc cactaaggaa gccacaacct caaagcacat tccgcaaact 
gacacagaag gagatcccct gatgaatatg ctaatgaaac tccaagaagc agccaattac 
tcgagcacac aaagctgcga cagcgaaagc accagccacc atgaagacat tttggattca 
tctcttgaat ctaccctcta gagggtgaaa aaagttaagg gaaaagactt tgcttttaaa 
aaaatgtttc aaaagaaagg tattttcact aaaccactgc cagtataaaa gcaccctgtc 
aagggccctg acccagagtt gtggtctcca aggaggcagc agaactaagt ctgaaccgcc 
aagatgctaa attgcaatgg aagcttaact ttagtttatt tctaagcatt ttttatatct 
gtggagtaat agaaagctcc attactcaac tggaaaggac cctaatgaca gggcaactga 
acagattgca catgggatag ccaaactgga ctttctttgt ttcctcttta aaagtttaca 
atgcagacca ttttttgtcc cttccttttg tttcctctga ggggctgttc gccccaggca 
gggtccatct ttctgatctg tccaacctcc tttgtgccac acggtgctgg tcacagggct 
tcagtagtgt ttgtgttgtg cgctcacccc attccagaac aaatccaaga ggccagtcct 
ccataagcac aaatggaatt gtgcaaccac cagaaaaaca ctactgtggc aaactggaga 
agtgccaatt taattctaac tgccacgfctc tcatgatgtg ctccaccaac tttttagtat 
atgagtcact ggttttataa ggttgttttt accacagtgg tctttttaaa ccacctgccc 
actcccttaa caagagtttt ataccaatta ttagtcaaca ctgataaaag gcttttttag 
ggctttattt gtttgagcct tttcagtgaa agaaggaaca tttcctatgg tgctgtctca 
ctgccttaaa acagatttct atgacagttt aacagttggt ttaaatccta aaccattggt 
aatttccact gtcttttcat ttacaaccaa gcaacaccag ttaacatagt agcctcatct 
ctatatatct ttctcttttt tttttttttt tgaagaaatg gataggagaa agatcagtat 
ttttagcctt gtgaatagat cgctttgcct atcctccaaa atattaaaat aacccagaaa 
tgctctttga ccgtcactta aaacctaaga catgtggcga aattccatcc agttctaagt 
gaaagagttt cagaaggcag gagattttga attattatcc agcagggctg gaagcactag 
atgcagcatg agcacaacta ttcggctttc cttccctatt gtttttgttt ttttaatgag 
ttttgacgca tgttgttttg attgctattg ttgtacatga gaaattcagc attaaagaac 
actgaagcgg taaggtcact gtggaagagg aagcgtttat actgtaaaag aaggttagat 
ttgcacagtc tactgggtag gtattgtaaa taataatttt taaaacttgc acaaatcaaa 
acaaacacaa acaaaattgt attttatcct attggtgtta agaggtgttt cacttgctga 
gatttcctgt acattgcaaa caaatacaga atgcaaaccc tcaaagctgt attatctggt 
gtgtttgtcc tgtatttaca gttgtttttg actatgcagg agctatcagt gctagagtga 
gcatgcttca aaactgtaca tgaagccaat atatttttgg ataagtaaaa ctgtctgaaa 
gtacatctgt catggcaggc tttaaagaga gtgcatgaaa actgatcagt cattggagaa 
gttaccacca cacacaaagg acaggtttta agtttatgaa acccaagggc taggccatgg 
tatagacttc ttctatgagt gtgtgaaaat gtgttacttt taggacgtgt atttggtgct 
actctctgtg accaccaatg ggtcagttgc tatagaacaa caacaccacg aaacatctgt 
gcagttttca gagtgtcaca aagtcaatag gtccttacac ggtgctattg ccctaaggga 
aatccgaact gaatttatgc acatagaatt gtcaccctga ctttgaagcc tcaaacatgg 
atcaaatctg ttgtgaaaca tcaatatatg tagctggatg agtgactagt ttcccttgta 
taatatgtga tctaagaaaa ttgctaatct ttccctgcca ttttgagaaa cacagtccaa 
acatgagcat aaacagaatt tcctgcaata catcccagta ggtccaccta gtttacaact 
taaactagtt tgtgaaacat ttgtctgtat acattttata ttttgtacat tttgatgtaa 
catatcatgt aaataggcag aaacagtgaa ataaatcatc tgaaaagttt tgtagtcttt 
gtaaagcccc aacaataagt acttggtgtc aatggactta actggatgat gtattttcta 
ttggtttatt gttcctctag cttgtaaacc agcttgcata tatttttttg caaatgtgca 
ccctgtatct gtctaaatta ttactttgcc attaaagtgg aattatttat tgac 
Figure 2: Illustration of a multiple sequence alignment between the different
members of the Unc-53 protein family. 



Ce-unc-53
Hs-unc-53/3 1 MP VLGVASKLRy PA VGSKP\HiTALP I PNLGTTGSQKCSSRPLELAETESSMLSCOIALKS
Hs-unc-53/2 MES
Hs-unc-53/1

Ce-unc-53 MTTSNVELIP . I YTDWANRHLSXGSLSKS 1 RDI 3NDFRDYRLVS QL INV
Hs-unc-53/3 61 TCEFGEKKPLQGKAKEKEDSK . IYTDWANK^'IAKSGHKJU.IKDLQQDIAlXrVT^I^EI IQI
Hs-unc-53/2 4 VSE S S QQ OKRK ?VI HGL EDQKRI YTDWANHYLTK SGHXRL I KDLQQ DVTDG VLLAQI X QV
Hs-unc-53/1

Ce-unc-53 49 I VP I NE F S PA? TKRLAX ITStJLDG L ETC LO YLKNLGI*DC SKLTKTD ID SGNL GAVL
Hs-unc-53/3 120 IA. .NEKV^INGCPRSQSQMIENVUTCLSFLA^^
Hs-unc-53/2 64 VA. .NEKIEDINGCPKNRSQMIE17IDACLNFLAAKGINI0GLSAEEIRNGNLKAIL
Hs-unc-53/1

Ce-unc-53 105 QLLFLLSTYKQrXRQLKKDQKiaEOLPTSIMPPAVSKLPSPRVATSATASAT
Hs-unc-53/3 174 GfcFFSLSRYKQ. . . CQHHQQQ . . YYQSLVELQCRVTH . ASPPSEASQAXTQQOMQS ( SLAA
Hs-unc-53/2 113 GLFFSLSRYKQ. . . QQQQPQK . , OKLS . SPLPPAVSQVA3APSQCQAGTPQQQVPV»TPQA
Hs-unc-53/1

Ce-unc-53 157 . . NPNSNFF . QMSTSRLQTPQ SRISKIDS . . SKIGIKPKTSGLKPPSSSTTSSNNT . NSF
Hs-unc-53/3 228 . .RYATQSNHSGIATSQKXJPT] RLPGP . . S R V? AAG S S S KVQGA SNL. .NRRSQSF
Hs-unc-53/2 172 PCQPHQPAPHQQSKAQAEMQS RL3GP . TARVSAAGSEAXTRGG STTANNRRSQSF
Hs-unc-53/1



Ce-unc-53 211 R ?SSR S SGNNNVGS TI STS A. KS LESS STYS SI StJI-NR. . 

H3-unc-53/3 27? NSXDKNK. . . . PPNYANGNEKDS . 3KGPQSSSG . . VNGNVQFPSTAGC '- . PPAS 

Hs-unc-53/2 226 NNYDXSKFVTSPPPPFSSHEKEPIASSA55HPG . . KSDNAPASLESGSS . STPTNCSTSS 
Hs-unc-53/1 



Ce-unc- 53 24 8 . . PT . . SQLQKFSRPQTOLVRVATTTKiGSSK LAAPKAVSTPKLASVXTI . CAK 

Hs-unc- 53/ 3 322 AI?S? . SAS . KPWRSKSMtAnCHSATST^TVKQS STATS FTPS 3 . . - DRLKP . PVSEGVX 

H9-unc-53/2 283 A1PQPGAAT . KPWRSKSLSVKHSATVSML3VTC PPGPEA. . . PR . . . . PTPEAMK 

Hs-unc-53/1 

Ce-unc-53 297 QEFDNSGGGGGGML.XLKLFSSKNPSSSSNSP . .QPT. . RKAAAVP QQ . QTLSKI 

Hs-unc-53/3 376 TAPSGQK . . . . SMLEKFKLVNARTALRPPQPPSSGPSDGGKDD . . DAFSESGEMEGPNSG 

Hs-unc-53/2 329 PAPNNQK . . . . SMLEKLKLFNSKGGSKAGEGPGSRDTSCERLETLPSFEESEELEAASRM 
Hs-unc-53/1 

Ce-unc-53 346 AAPVKSGLKPPTS . . . KL». . . GS A . TSMSXLCTPKVS YRXTD 

Hs-unc-53/3 43 0 LNSG . • G . . . STNSSPKVSFKItAPPKAGSXNLSNKKSLLQPKEKE EKNRDKNK . 

Hs-unc-53/2 3 85 LTTV. . G . . . PASS S PKI ALKGI AQP.TFS RALTNXXS S L KGNE KE XEKQQREKDKEKSKD 
Hs-unc-53/1 

Ce-unc-53 3 81 

Hs-unc-53/3 478 VCTEK . PVREEKDC VTEMAPKXTSK IASL X PKG 5KTTAAXKESL I P 

Hs-unc-53/2 440 LAXRASVTERLDLXEEPREDPSG . . . AAVPEM. PK2CSSKIASFIPXGGIO-NSAKKEPMA? 

HS-unc- 53 / 1 1 . . MLPKRAKAPGCCGGMAFASAAELKVFKSGSVDSRVPGGPPASNLrRKQKSLT 

Ce-unc-53 3 81 APIISQQDSKRCSKSSEEESGYAGFNSTSPTS5STEGSLM. HSTSSKSS 

Hs-unc-53/3 523 S SSGI PKPGSKVPTVKQTISPGSTASXESEKFRTTKGS PSOSLSKP . IT . MEXASASSCP 

Hs-unc-53/2 496 SHSGIPKPGMKSMPGKSPSAP, . A? SKEGERSRSGKLS SGLF3QKPQLDG . RHSS S3 SSL 
Ce-unc-53 429 T5DE . KSPS5DDLTLNASIVT . AIRQPIAATFVS . PNIIN KPVEE. .KP.TLA
Hs-unc-53/3 581 APLEGEEAGG . . ASPSGS . CTMTVAQSSG . . QSTGNG . . AVQLP . . . 0 . QQQHSHPNTAT
Hs-unc-53/2 553 ASSEGKGPGG. . TTUNHSISSQTVSOSVGTTQTTGSNTVSVQLP . . . QPQOOYNHPNTAT
Hs-unc-53/1 106 FQAXGSPAGG . ,A. . . . KTPLAPLAPNLGKPSRIPRGPYAEVKPLSKAPEAAVSEDGKSD

Ce-unc-53 476 VXGVKSTAKKDFFPAV. .PPRDTQPTIG. . V . VSPIMAKKKLTNDPVISEK FEPE
Hs-unc-53/3 630 VAPFI YRAXSENEGTALPSAD5CT . S . P . . TKMDL . . S . YSZTAKQCLEEISGE . . . GPE
Hs-unc-53/2 608 VAPFLYR3QTDTECNV. .TAESSS.T G. .VSVEP. . SKFTICTGQPALEELTGE . . . DPE
Hs-unc-53/1 160 DSLLS SKAKAOKSSGPV PSAXGQE . E . RAFLKVDP . . . ELVVTVLGDLiEQLLFSQMLDPE

Ce-unc-53 526 . . KLQSMS2DTTDV . PPLP . PLKSWPLKMTSIRQP , P7YD VLLXQGKI
Hs-unc-53/3 680 TRRMRTVK - N I ADLRONLEETMS S LRG TQ I S H STLE . 7TFDSTVTTEVNCRTI PNLTSRP
Hs-unc-53/2 657 ARRLRTVK.NIADLRQNLEETMSSLRGTQVTHSTLE . TTFDTOFv TTEHSGRS IL.SLTGRP
Hs-unc-53/1 215 SQRKRTVQ . NVLDLRQNLEETMSSLRGSQVTHSSLEMTCYDSD . . . DANPRSVSSLSNRS

Ce-unc-53 570 T SPVXSFGY
Hs-unc-53/3 738 TPMTWRLGQACPRLQAGDAPSU-AGY . PRSGTSRFIHTOPSRFWYTTPLRRAAVSRLGNM
Hs-unc-53/2 715 TPLSWRLGQSSPRLQAGDAPSMGNGYPPRAKASRF1NTESGRYVYSAPLRRQLASRGSSV
Hs-unc-53/1 271 SPLS WRYGQ S SPRLQA3DAP SVGGS CRS EOT PAWYMHGERAHYSHTMPMR . .SPSKLSHI

Ce-unc-53 579 EOSSASEDSIVAHASAQVTPPTKTSGNHSLERRMGKN.KTSE. . SSGYTSDAGVAMCAKM
Hs-unc-53/3 797 SQIDMSEKA . SS . DLDMSS . EVDVGGYMSDGDILGKSLRTDD . INSGYMTOGGLNLYTRS
Hs-unc-53/2 775 CHVDVSDKA . GD . EMDLEGI 5MDAPGYMSDGDVLSKNIRTDD . ITSGYMTDGGLGLYTRR
Hs-unc-53/1 329 SRLELVESLDSD . EVDLJCS GYMSDSDLMGKTMTEDDDITTG



Ce-unc-53 636 REKLKEYDDM. .TRRA. . QN GYPDKFEDSSSL5SGISDNNSLDDISTDDL.SGV . . 

Hs-unc-53/3 853 LNRIP. . D.TATSRDIIQRGVHD\T\'DADSWDDSSSVSSGLSDT . .LDNISTDDLNTTSS

Hs-unc-53/2 832 LNRLP. .IX5MAVVRETLQRNTSLGLGDADSWDDSSSVSSGISDT , . IDNLSTDDINTSSS 

Hs-unc-53/1 369 WDESSSISSGLSDA . .SDNLSSEEFNASSS



Ce-unc-53 685 D. .MATVASKHS
Hs-unc-53/3 908 VSSYSNITVFSRKNTQ. . LRTDSEKRSTTDE7 . .WDSPEELXXPEE.DFDS . . . HGDAG.
Hs-unc-53/2 888 ISSYANTPASSRKNLD. . VQTDAEKHSQVERNSLW . SGDDVKKSDG . OSDSG . IKMEPG.
Hs-unc-53/1 397 IJ^SLPSTPTASRRHSTX^/LRTDSEKRSLAESGLSWSESEEKAPKJOLtEYDSGSLKMEPGT



Ce-unc-53 695 

Hs-unc-53/3 959 GKWKTVSSGLFEDPEKA . GQKA-SLSVSQTGSWRRGMSAQGG . . AP . SRQKAGTSALKTF .

Hs-unc-53/2 942 SK^mRNPSDVSDDSDKSTSGKXNPVISQTGSWRRGMTAQVGITMPRTKASAPAGALKTPG
Hs-unc-53/2 E 

Hs-unc-53/1 457 SKWRRERPSSCDDSSKGGELKKPISLGHPGSLKKGKTPWAVTSPITH . ,TAQSAI>KV..



Ce-unc-53 695



Hs-unc-53/3 1014 . GKTDDAEASEKGXAPLXCSSLQRSPSDAGKSSGDEGKXF . PSGIGRSTA , TSSFGFKXP 
Hs-unc-53/2 1002 TG KTDDAKVSEKGRJL SPKASQVKR S PS DAGR SS GDES KK PLPS SSRTPTANAKSFG FKKQ 
Hs-unc-53/1 513 AGK . PEGKATDKGKLAVTOTTGLQRSSSDAGRDRLSDAKKP , PSGIARP . STSGSFGYKKP 



Ce-unc-53 695 

Hs-unc-53/3 1071 SGVGSS . AMITSSGATITSGSA.Tl^KIPKSAAIGGKSKAGRKTSLDGSQNQDDVVIiHVSS 

Hs-unc-53/2 1062 SG SAAGLA^rTASGVTVTSRSATLGKIPKSSALVSRS . AGRKSSJOGAQNQDDGYLALSS 

Hs-unc-53/1 570 F . PATGTATVMQTG GSATL3KIQXSSGI PVXPVNGRX7 SLDVSNS AEPGFIAPGA

Ce-unc-53 695 

Hs-unc-53/3 1130 KTTLOYRSLPRPSKSSTSGIFGR . GGHRSSTS3ID . SNVSSKSAGATTSKLREPTKIGSG 

Hs-unc-53/2 1121 RTNI.QYRSLPRPSKSWSR . . . NG . AGKRSSTSSID . SNI55KSAGLPVPKLREFSKTALG 

Hs-unc-53/1 624 RSNIQYRS1.PRPAKSSSMSVTGGRGGPRPVSSSIDPSLLSTKQGGI-TPSRLKEPTKVASG 
Ce-unc-53 695

Hs-unc-53/3 1188 . RSS . FVTVNQTDXEXEK VAVSDSESVSLSGSPXSSPTSASACG . AQGLRQPGS

Hs-unc-53/2 1176 . SSI . PGLVNQTDKEKG ISSDNESVASCNSVXVNPAAQPVSSPAQTSLQPGA

Hs-unc-53/1 684 . RTT . PAPVTJQTDREKEKAKAXAVALDSOKI 5LXS IG5PESTPKN QASHPTAT

Ce-unc-53 695 DYSHFVRHPTSSSSXPRVP 

Hs-unc-53/3 1239 KYPDIASPTFRR ( LFGAXAGGXS ASAPNTEGVXS SSVKPSPSTTLARQG SLESPS SGTCSK

Hs-unc-53/2 1226 XYPDVAS PTLRR LFGGKP . TKQVPIATAENMKKSWISNPKATMTQCGNLDSP . SGSGVL 

Hs-unc-53/1 735 KLAEL PPTPLRA T. .AXSFVXP PS LANLDKVNSN SLDLPSS 

Ce-unc-53 714 SRSSTSVDSRSRAEQENVYKLLSOCRTSQRGAAATSTPGQHSLRSPG YSSTYS 

Hs-unc-53/3 1299 GSAGGLSGSSSPLFNKPSOLTTDVISLSHSUIS SPASVHSFTSGGLVWAAMMSSSS 

Hs-unc-53/2 1264 S . . . . SG SSSPLYSKNVDLN QSPLAS . . . . SPSSAW5AFSNSLTWGTNASSSS 

Hs-unc-53/1 774 SDTTHASKV?DLHATSSASGGPL»P5CrT?SPAPILNINSASFSQGLELMSGF

Ce-unc-53 766 FHLSVSADKDTM5 . MHSQTSRRPSSQXPS YSG { C FH S LDRKCKLQE F . T3TEHRMAALLS P

Hs-unc-53/3 1355 AGSKDTP5YQSM7SLHTSSESIDL.PLS HHGSLSGI/TTGTKEVQSL . . LMR . TGSV 

Hs-unc-53/2 1329 AVSKDGLGFQSVS SLHTSCES IDI SL5SGGVP SHNS3TGLIASSKD. DSLTPFVR . TNSV 

Hs-unc-53/1 826 5VPKETRMYPKLSGLHRSMESLQMPKS LPSAFPSSTFVPTPP . APPA

Ce-unc-53 824 RRVPNSMS KYDSS) ( AAA1*NASGMSRSMILLESL SFRPFRRHQSPAD3 CIITASPSAPRRS 

Hs-unc-53/3 1407 RSTL.SES) (MQLDHNTLPKKGLR) YTPSSRQANQEEG

Hs-unc-53/2 1387 KTtL . SES . PLSS PAASPKFCR5TLPRKQDSD PHLDROTLPKKGLR YTPTSQLRTQEDA

Hs-unc-53/1 872 APTE.EET . EELT WSGSPR AGQLDS NQRDRNTLFKKGLR Y . . . . QLOSQEET

Ce-unc-53 883 HSPRGPTARIPLSL. . ASSPVHVNNNW } GS YSARSRGGSST CIYGETF

Hs-unc-53/3 1441 KEWLRS H S TGGLQDTG NQ S P L VS P SAM SS SAAGXYHF3NL VSPTNLSQFNLPGFSMMRSN 

Hs-unc-53/2 1444 XEWLRSHSAGGLQDTAANSPFSSGSSV TSPSGTRFNFSQL AS PTTVTQM5 L SNPTMLKTH 

Hs-unc-53/1 918 KERRHSHTIGGLPESDDQSELPSPPAi, PKSLSAXGQLWI { VSPTAAT TPRITRSN

Ce-unc-53 928 OLHRLS, . . DEXSPAHSAKSEZ* . .GSQLSLASTT . .AY 

Hs-unc-53/3 1501 S I FAQDS SFDLYDDS QLCG5ATSLEERPRA I S . H SG SFRD SMEE VHGSSLSLVSSTSSLY

Hs-unc-53/2 1504 SLSNADCQYDFYTDSRFRNSSXSLDEKSRTMS . RSGSFR3GFEE VHGSSLSLVSSTSSVY 
Hs-unc-53/1 973 SIPTHEAAFELYSGSQM.GSTLSLAERPKGMI . RSGSFRDPTDD} VKG5VLSLASSASSTY



Ce-unc-53 959 GS LNEKYEHA . IRDMARDLSCYKNTVDSLTXKQ

Hs-unc-53/3 1560 ST AEEXAHSE QIKKLRRELVASQEKVATLT , SQL SAN

Hs-unc-53/2 1563 ST PEEXCQ5E . IRXLRRELDASQEKVSALT .TQLTAtf

Hs-unc-53/1 1031 SS { AEERMQSE}QIRXLRRELESSQEKVATLT . SQLSAN( VSAMKYGXIKAVLITIVRQVQPR

Ce-unc-53 991 . ENYG A. LFDLFE0KLRKLTQHX3RS>TLKPEEAIRFR^DIAKLRDXSNHI*ASNSAHAKEGAG

Hs-unc-53/3 1596 AHLVAAFERSWNMTGRLQSLTMTAECK . . . E3ELIELRETIEMLKAQNSAAQAAIQ 

Hs-unc-53/2 1596 AKLVAAFEQSIX5NMTIRLQSLTMTAEQX . . . DSELKE^RKTIELLKKQMAAAQAAIN 

Hs-unc-53/1 1090 EEJJYl*) AKI»VAAFEQSLVNMTSRLRHLAETAEEX . . . DTELLDLRETIDFLKKKNSEAQAVIC

Ce-unc-53 1041 ELL RQPSLESVASHRSSKSSSSXSSKQEKISLSSFGKNK 

Hs-unc-53/3 1650 GAI-NGPDHPPR DUURRQHSSESVSSINSA7SHSSICS . . .GNDADSKKKKX 

Hs-unc-53/2 1652 GVINTPELNCKGNGTACSADLRIRRQHSSDSVSSINSATSHSSVGS . . .NIESDSKKXKR

Hs-unc-53/1 1143 GALNASETTPX ELRIRRQNSSD3ISSLNSITSHSSIGS . . . SKDACAKKKKK 

Ce-unc-53 1090 KSW(AXSVDSC) IRSSLSX. FTXKXN . K. . . . NYD EAHMPS . . .1. . S . GSQG , .

Hs-unc-53/3 1699 KNW(VNSRGSE) LRSSFKQAFGKKXSTKPPSSKSDIEELT. . OSSLPASPKLPHNAGDCGSA

Hs-unc-53/2 1709 KKW{VN E) LRSSFXQAJPGXXXSPKSASSH5DIEEMT . . DSSLPSSPKLPHN. GSTGS . 

Hs-unc-53/1 1198 KSWCV. . YE) LRSSFNXAFSIXKGPKSA3SYSZ3IEE IATPDSSAPSSPKLGHGSTETAS?



Ce-unc-53 1129 T.L. ...... DN ID VIELKQELKERDSALY
Ce-unc-53 1151 EVRLDMLDRAREVDVLRETVNKLKTEiraQLKKEVDKL . . . TNGP . . ATRASSRAS . . . I .

Hs-unc-53/3 1816 DIRLEALSSAKHLDQrREAM^^ONEISILKAE^ORLKAETGNTAKPTRPPSESSSSTSS

Hs-unc-53/2 1800 DIRLEAX.SSAHQtJ30I-REAl©n^CSEIEKLKAENORLKSESQGSG.CSRAPSQVS . ..IS 

Hs-unc-53/1 1310 DIRLEALNSAHQLDQL.RETMKNMQLEVDLLKAENDRLKVAPGPSSGST . .PGQVPGSSAL

Ce-unc-53 1202 . . PVIVD . . . DEHVYDAACSSTS . . . . . . . AS£SSKRSSGCNSIKVTVNV\ .DIAGEI SS 

Hs-unc-53/3 1876 SSSR . QSLGLSLNNLN. XTEAVSS DII»LDDAGDATGHKP . GRSVKI IVS I SKGYGRAK DQ

Hs-unc-53/2 1856 ASPR- QSMGLSQHSLN . LTESTSL (DMIXDDTGECSARREGGRHVKIWSFQEEMKWKE ) DS

Hs-unc-53/1 1358 SS?R. RSuGLALTHSF . GP5LADT DL5PMDGISTCGPXES . VTLRWVRMPPQHIXKG PL

Ce-unc-53 1248 IVNPDKEIIVGYLAM3 TSQSCWKDI . DVSILGLFEVYLSRIDVEHQLGIDARDSILGYQI 

Hs-unc-53/3 1923 . . KSQA . VLIGSIGVS . GKTKW . DVLDGVIRRLFKEYVFRIDTSTSLGLSS . DC I AS YC I 

Hs-unc-53/2 1914 . . RPHL . FLIGCIGVS { * ) . GKTKW . DVLDGVVFJILFXEYIIHVDPVSQLGI^JS . DSVLGYSI

Hs-unc-53/1 1425 . . KQQZ . FFLGCSKVS . GKV5W . KMLDEAVFQVFKDYISKMDPASTLGLST . ESIKGVSI

Ce-unc-53 1307 GELRRVIGDSTTMIT5H * . PTDILT . SS?TIR1^MHGAAQSRVDSL\XDMLLPX0MILQL
Hs-unc-53/3 1987 GDLIR. . . . SHNLEVPELLPCGYI.VGDNNIITVNLKGVEENSLDSFVFDTLIPKPITQRY
Hs-unc-53/2 1968 GEIKR. . . . 5OTSETPELLPCGVXVGENTTIS\ f TVKGLA£NSLDSLVFESLIPKPILCRY
Hs-unc-53/1 1479 SHVKR. . . . VLDAEPPEMPPCRR, . GVNN. ISVSX^KGLKEKCVDSLVFETLIPKPMMQKY

Ce-unc-53 1364 VKSILTERRX.VLAGATGIGKSK1AKTLAAYVSIRTNQS . EDSIV . NISIPENNKEELLQ 
Hs-unc-53/3 2042 FNLLMEHHF.IIL3G?SGTGKTYLANKLAEn r rTK3GRXKTEDAlATFNVDHKSSKELQQ 
Hs-unc-53/2 2023 VSLLI EHRF.I ILSGPSGTGKTYLANRL SEYIVLREGRELTDGVIATFNVDHXSSKELRQ 
Hs-unc-53/1 1531 ISIXLKHRRLVLSGFSGTGKTYLTNRI-AEYLVERSGREVTEGIVSTF^ 

Ce-unc-53 1421 VERRLEKI^RSKESC IVILDNIPKNRIAFWSVFANVPLQN . . . NEGPFWCTVN

Hs-unc-53/3 2102 YLANLAEQCSADNNGVELPWIILDNL . .KHVGSLSDIF . NGFL . NCKYNKCPYIIGTKN
Hs-unc-53/2 2083 YLSlwXADCCNSENNAVDMPLVIXLDNL . . HHVSSLG2IF . NGLL . NCKYKKCPYIIGTMN 
Hs-unc-53/1 1591 YLSNLANQiDRETGIGDVPLVILLDDL . . SEAGSISELV . NGAJL . TCKYHKCPYIIGTTN 

Ce-unc-53 1473 R. . YQIPELQIKHNFKMSVMSNRLE . . . GFILRYLRRRAVEDEYRLTVQMPSELFKII 

Hs-unc-53/3 2153 QGVSSSPNXELHHNFRWVLCA>niTEPVKGFLGRYLRRKLIEIEIERNIRNK. DLVXII 

Hs-unc-53/2 2139 QATSSTPNLQI^HKNFI^Vr/LCAbnJTEFVKGFL^RFLRRKLMETEISGRVRNM . ELVK.II

Hs-unc-53/1 1647 Q?VKKTPIWGLhT»SFR21LTFSraT^EPANGFLVRYIJUUCLVESDSDINAN^ . ELLRVL 

Ce-unc-53 1526 DFF PI AL QAVNNF IEKTNSVDVTVGPRACLNC PL TVEG SREWFI RLWNENFI PYLERVA

Hs-unc-53/3 2215 DWI FXTWHHLNSFLSTHS S SDVTIG PRLFLPCPMDVEG SRWFMDLWNYSLVPYILEAV

Hs-unc-53/2 2196 DWIPKVW^LOTFLEAHSSSDVTIGPRLFLSCPIir^^^ 

Hs-unc-53/1 1704 I^'PKLWYKLHTFLEKHSTSDFLIGPCFFLSCPIGIEDFRTWFIDLWNNSIIPYLOSGA 

Ce-unc-53 1585 RDGXXTFGRCTSFEDFTDIVSKKWPWFDGENPEN . . . .VLXR^QLQDL VPSPAN

Hs-unc-53/3 2274 RECLQMYGKRTFV/EDPSKWVLDTYFW . . SSATLPQESFALLCLRPEDVGYESCTSTXEAT

Hs-unc-53/2 2255 REGLQLYGRRAPWEDFAKWVMDTYPW . . AASPQQHEWPPLLQLRPEDVGFDGYSMPREGS

Hs-unc-53/1 1763 KDGIKVHGOKAAWEX)?VEWVPJDTLFW. . PSA . . QQDQ SKL YHL FPPTVGPHS I AS PPEDR 

Ce-unc-53 1635 SSRjQ HFNPL . ESLIQL.HATKH . . , QTIDHI 

Hs-unc-53/3 2332 TSKHIPQTIX'EGDPLMNMLHKLQEAANYSSTQSCDSE . . STSHHEOILDSSLESTL 

Hs-unc-53/2 2313 TSXQMPPSDAEGDPLKNMI-MRLQEAANYSSPQSYDSDSNSNSHHDDILDSSLESTL 

Hs-unc-53/1 1819 TVKDSTPSSLDSDPLKAMLLKLQE^ANY- . IESPDRET ILDPNLQATL 
Figure 3: Illustration of a multiple sequence alignment between C. elegans Unc-53
(Ce) and C. briggsae Unc-53 (Cb)

Cb 1 OTTSKVELIPIYTDWAMRJil-SK^ 

Ce 1 MTTSlT7SLIPlVTOWXNRKiSKRSl.SK5IROI SNDFRDYSU-VSQLIJMVIVPI KEFS E AFT KKLAKITSNLD6IJSTCLDY1. 

Cb 81 KNLGLD C SKI-TXTD IDSCntfLGAVLQLX FLL SS YXQKLRQI-XXJ^KXLSwL FVTTTTTA IHP PAVSNIPYSRLFS PSVP PA 
C* 8i ja^GU>CSXI,TXTDtDSGNLGAVTQ^ TSIMPPAVSiaS SPRVATSATASA 

Cbi6: S^BJSNTTQMSTSRLOTPCSRXSKPDSTKlGIKPXTTSr-LRPP . STTSSNTN2NSFRPSSF-SSGNNNVOSTISTSARSXJD 

CeiS6 TCPNSNrFQKSTSRLQTPQSRISKISSSiUGlKPX.TSGI^^ 

Cfc240 S S SAYS SI SBTLS Rr TPSSQI aKPTSRLQTC^VRVATTTFI SS SFJUAA PKAV STPI^ASVXT IKTTTTEHDNS . . . .GGKL 

C B 235 SSSTYSSISJCAW?. . SSI.QXP. SKPQTCLVRVATTTKIG5SKI^APKAV3TPK^A5^;TX . GAXQEPDNSGGGGGGKL 

Cb316 XLKLFSSKNASSSNNrPQPX-RKA EQ . . SK IAAPVTTGLKPPT S ST MKI*G irjr£MSKLCT PKVS DTLI-KTK5 

Ce311 KLlXFSSKNPSSSSNSPQPTRKAAAVTQQQT^^ • • KLG S A TS MSTXCT PKVSYK XT DA? I ZSQQ 

Cb3 33 DSKRCSKSSEE2SGYASF*?STS?ASSST£GSI.5KHSTS5XSSTSDX^ 

C*338 DSXRCSKSSEEESGYAGFNSTSPT SS STEGSLSKHSTSSHSSTSDEKS PSSDDLTLNS S I VT AI RQF I AATFVSFNIXNK 

Cb4€3 PUKKPTUAVKGV. SASFJ^PPPTVTHITNCFTIGWSFI^ 

CS468 PVl^FPTLAraGVKSTAXXDPPFAVTPKDTQP . . PEPEKMS"SID7TDVFF£-& . PL 

Cb^46 FLSI-EJ?VPPKMTPI*QPP7YDVL^^ . V^MAPPVQ^TSAGQSSWERRIQIC^T 

CcS45 KSV . . VPLRH7S IRCPPTYDVZ-LKGOKITSFVKSFGYEQ - . SSASECSIVAHASAQVTPPT . KTS . GNKSLERRMGXNKT 

Cb62 4 SES SGYASI^GVAMCAKMKEK^KZ\TDMTRRAQNGYPDNFED3 S SJLS SGI SDNCTELDDI S 7DDL5G IDMATVASKKSDYS 
Ca6I9 SES SGYTSDAGV'AJlC^JQiR£lC.J^YLnOTRRAONCiYPI5HPEDSS SLSSGISBKN21-DDISTDDLSGVDMATVA5KHSDY2 

Cb?04 KFVRHTSSS3SRPRVPSP.FSTSVDS~SRVEQENV^ 

Ce699 KFVRKPTSSSSKPRWSK5STSVCSRoK^QENVYKL ■ ATSTFGvKSLRSPSYSSYSPHLSViADKDT 

Cb76* MSMKSQTSRRPS SQKPSYAGQFHSL2RKCHL0EF7SAEHRMAAIJLSF RP.VPNSKS KYES S SG S YSARS KGGS ST GITGE P 
Ce77 8 MSMK S2TSRRPS SYS GQFKS LEP-KCKuQEFTS TEKRMAAL.L3 PRRVPNSMSJTiE . SSGSYSARSRGGSSTGITOST 

Cb864 FQtJKRLSDE>ZSFAHSAR5EK'3SC;,SIJlSTTAYGTI2^ 

Cfi857 FGLKPXSDEXSPAKSAJC5BKG£QL£I*ASTTAYGSL!TEKYEHAIRI5MARCL ECYKnTVLSLTKKQSJfJfGALFDLFEQJXRlC 

Cb944 LTS^IDRS^KTBEArRFRCDIAHIiRSISNH^ 
C*927 LTQKIDRSNLKPEEAZ^ RCDIAHIJOIS^^ 

Cbl C2 4 KNKK5 WTRS SI, SKFTIGCTO^KNTOEGHMPS 3 SGSQGTI*DWIDVT ELROEL'l'^R^SAL\'XVRXX)NI-DRAREVDVLKETWKL 
C C 1 0 1 7 KNKKS WXRS SLSKFTKKXNXNYDEAHKPS 1 SGS QGTLDH2DVI £I^GELKERI>3>XYEVRiDN^RAREVD^R2TVI^ 

CC1104 KNETJKQI/K2C£VDK2,TN7 3TTRAS SRASLPYI QDDEHVYDHAC S5TSA5QS 5XRSSGCNS IXVTVNVDXAGEIS 51 VNPDR 
C«1097 KTETOQLK3^S\TKLT WPATR/^ £RA5 IPVIYDDEH\*YDAftC S ST S ASQS SKR SSCCNS I KVTVNVDIAGEI SSI VNFDJt 

Cbll84 EIIVGYLSMPAMNSTWKDlCDSIlXiSPEKYLS 
Ceil77 EIIVGYtAHSTSQSCkTCDIDVSILGLFEVYLSRIDV^ 

Cbl2 64 I RMFKYGAAQ S KLL P RG*H I*Q L VKS 2 VTE RRLVLAGA TG I GK S 2C*AK 7I-AA YYS TNQ 5EDKIVNI TI PEN 
Cal2 57 IIOCTMKGAAQSRVIDSLVLEMLLPKQI^I^LVKSILTERKL 

Cbl* 4 4 CTXSLLgVER * r ^E KI T ? 5K EACYVT1 PN"T PKNRIAFW SVFANVPLQNIJEGPFVVCTViJRYQI PELKTI FNFKMSVXSNH 
C e 1 3 3 7 KXEELLQ VERRLERI LR5XES C XVI LDNZ PKNRIAFWSVFW7VPL0NNEG PF VVCTVNRYQ I PELQ IHHNFKMSVMSMR 

Cbl424 LEGF2U*YI-RRRAVED£YRI,SV'^£SLS^ 

Cei4 3.7 LEGFIIJ*YLRRRAv^DE¥T*LTVQKFS£I^KIirF?PIA^^ 

CblS0<l QNFIP\^RVARTCKKTLGR:rrsrESpTDI\rrEKrf^ 
Cel497 £tf iPY^ERVARDCKTTFGRCrTSFZCPTDr/SXXWXWTC^ 
Figure 4. Prosite Signatures
Block A, Large family:

IYTDWANXXLX ! K , R > ( A . G . S . T ) XXX ( K , R ) X { ILVA ) (H,X,R,T,S>D(I,L) XXDXXDXXL < L . V 
) (A, S ) (N, D , Q , E ) (I,L,V.A)I(N,D,Q,£) ( I , L , V . A > ( I , L , V , A) (V,A,?,S)X(17 # 19) < 
l.L.t) (N,fi r Q, E)X(I,L, V, A) <N,D,G.E)XCLXXLXXX{A,G, S. T) ( X , L , V . A J X ( 4 , 5 ) { I . 
L, V, A) ■ (S.T)XX(N, D, Q, 2) IXXGXLXA(V, I ) LXL ( L , F ) FXL SX { Y , F>KQ 

Block B, Vertebrate:

PEXXRXRT V (Q, K) N I I ,L,V, A) ( I . L , V , A3 DLRQNLEETMS SLRG ( S , TJ Q ( V , I ) (S,T)HS(S,T 
J LEX C 0 . 1 ) T 

Block C, Vertebrate:

RX < S , T > P I L , M ) (Si T) WRXGQ ( S . A ) XPRLQ AGDA PS 
Block D, Vertebrate:

GYM5DXD(K.L,V, I J (M,L, V. I) ( A , G . S . T ) KXXXD ■ .2 . 3 ) I < N , T ) f A , G . S , T ) G ( Y . - ) 
Block E, Vertebrate:

WD ( D » E ) SSS ( M , L , V , I) 5SG (L , I) SDXXDN (I*. I) S ( £. T) (D, E> CD,E}XN(A,G,S,T) ( S, T) 
SS 

Block F, Vertebrate:
DRNTLPKXGLRY 

Block G, Large family:

GSX ( I , L , V, A) SL (X, L, V. A) S <A,G. S, T> (A,G, S. T) S (0, 2 > XY ( A, G. 5 . T>XX(E, N) E(K, 
R)X(4,$)I{R.H>X(L,M)XR(D,E> LXXXXXXVXXL7XXXXXXXXLXXXFE (Q,K) ( S . K ) LXXXTXX 
( L , I)XX(L,S)XXXXE(Q,E)XO , 6) <D.E) ( L , I ) XXLRXXX \ S . D , Q , E > XLXXXX ( A , S ) XA ( N , 
D, 2 , E ) XXXXXX I L , I ) X ( 0 , 2 1 ) RQXSX (N , D . Q . E / 5 < I , V) XSXXSXXSXSSX ( A , 0 . S , T ) S 
Block G, Vertebrate:

SGSFRDXX <D, E) ( ! E,D) VHGSXLSL ! V , A) SS(T.A) 3SXYS { T , 5 ) XSE < K . R ) XXS E < Q , - 

J X ( R , H ) KLRRELX ( A , S ) 3QEXVX <T,A)LT(T, 3)QL ( S, 7) ANAXLVAAFE ( Q . X ) SLXN(M,L, V, 

I ) MTXRL ( Q , R ) XLXXTAE C 3 . E ) XXXELXXLRXT I \t>.Zt XLKXXN ( A , S) XAQAXIXGX (L, I )N{A, 

G,S,T)X(N,;j,Q,E)XXXKXlC, 8) ( N . D . Q , E ) LRI ( K . R ) RQXS S ( N ( D , Q , E ) S < I , V ) SS { I , L > 

NSXTSHSSXGS 

Block H, Large family:

<V, L ) D5XVX ( D , E) XL { I . L) P XX ( M , L , V , I) XXXXXXX (L. XJ lM, L, V, I ) XXXR (I, L) ( I , V) L 
(A. S ) GX ( T , S)GXGX(T, S ) XL ( A . T ) XXLXXY ( M , L . V . I ) XX ( R , K ) 

and 

P {E,N)XX (I . L ) HXXF < K . R ) XXX ( A , S ) NXXEX ( 0 , 3 ) GF < L . I ) XP- { Y . F ) I. (X , R ) (K - R) (K,R) 
X(M.L,V,I) (O.E) 

and 

V ( 1 , L) EXXX (T, S ) X O, E) XXXGPXXX { L , I ) XCP < M , L , V , I ) X ( V . I ) (D,E)XX(R, X) XWFXXL 
WNXXX (1 . V) PY < L, X) XXX ( A , V) ( R , K ) ( D , E ) GXXXXGXX { T , A ) X ( F, Y, W J EDP 

Block H, Vertebrate:

{ V. LI DSXVF ( D , E) (T. S) LIPKP (M , L , V , I) XQXYXXLL <M. L . V, I ) XHXR (I , L) (I , V) LSGPS 
GTGKTYM K, T) NRLXSY CM. L . V , X ) XX ( R , X ) GR 

and 

VI { X , L) LD (D.N) LXXXXS i I . L ) XX f I , L } XNCXLXC KYXKC P Y X IGT ( T , M ) NQXXXX ( T , £ ) PNXX 
LHXXFRXXXX ( A , 3) NXXEP \ A , V) XGFLXR ( Y • F) L ( X f R ) (K,RJ (K r R)L(M.L.V ( I> (D,E> 

and 

(R.K) (V,I) \L,Z} DWXPXXWXH { I . L } XXFLEXHS { T . S i SDXXIGPXXFLXCP (M.L.V, I}X{V,I 
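The signatures above use a compact notation in which X stands for any residue, a parenthesised list such as (K,R) gives the residues allowed at one position, and X(m,n) stands for a run of m to n arbitrary residues. As a minimal sketch, assuming that reading of the notation, a cleanly transcribed signature can be translated into a regular expression and used to scan a protein sequence. The function name `signature_to_regex` and the test peptide are hypothetical and are not part of the original disclosure; only the Block F signature (DRNTLPKXGLRY) is taken from the figure.

```python
import re

def signature_to_regex(signature):
    """Translate a simplified signature string into a regular expression.

    Assumed notation: single letters are literal residues, X is any residue,
    (A,B,C) is a set of alternatives, and X(m,n) repeats X m to n times.
    """
    out = []
    i = 0
    while i < len(signature):
        ch = signature[i]
        if ch == "(":
            j = signature.index(")", i)
            items = [s.strip() for s in signature[i + 1:j].split(",")]
            if out and out[-1] == "." and all(s.isdigit() for s in items):
                # Numbers directly after X are a repeat count, e.g. X(4,5).
                out[-1] = ".{" + ",".join(items) + "}"
            else:
                # Letters are alternative residues at a single position.
                out.append("[" + "".join(items) + "]")
            i = j + 1
        elif ch == "X":
            out.append(".")
            i += 1
        elif ch.isspace():
            i += 1  # ignore layout whitespace
        else:
            out.append(ch)
            i += 1
    return "".join(out)

# Block F from the figure, scanned against a hypothetical test peptide.
block_f = signature_to_regex("DRNTLPKXGLRY")
match = re.search(block_f, "QANDRNTLPKKGLRYTPS")
print(block_f, match.span() if match else None)
```

The sketch deliberately handles only the three constructs named above; signatures whose brackets were damaged in transcription would need to be repaired by hand before being passed to it.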
Figure 7b: Illustration of the plasmid map and the
nucleotide sequence of the pGI3303 expression vector
(C-terminal Hs-unc-53/3 fragment in fusion with GFP)

[Plasmid map of pGI3303 showing Pcmv, eGFP, linker and Hs-unc-53/3]

ID pGI3303 circular DNA; 8140 BP
FT CDS 1225..2019
FT /vntifkey="4"
FT promoter 3330..3918
FT /vntifkey="29"
FT /label=Pcmv
SQ SEQUENCE 8140 BP;

CTAGATAACT GATCATAATC AGCCATACCA CATTTGTAGA GGTTTTACTT GCTTTAAAAA 60

ACCTCCCACA CCTCCCCCTG AACCTGAAAC ATAAAATGAA TGCAATTGTT GTTGTTAACT 120 

TGTTTATTGC AGCTTATAAT GGTTACAAAT AAAGCAATAG CATCACAAAT TTCACAAATA 180 

AAGCATTTTT TTCACTGCAT TCTAGTTGTG GTTTGTCCAA ACTCATCAAT GTATCTTAAC 240 

GCGTAAATTG TAAGCGTTAA TATTTTGTTA AAATTCGCGT TAAATTTTTG TTAAATCAGC 300 

TCATTTTTTA ACCAATAGGC CGAAATCGGC AAAATCCCTT ATAAATCAAA AGAATAGACC 360

GAGATAGGGT TGAGTGTTGT TCCAGTTTGG AACAAGAGTC CACTATTAAA GAACGTGGAC 420 

TCCAACGTCA AAGGGCGAAA AACCGTCTAT CAGGGCGATG GCCCACTACG TGAACCATCA 480

CCCTAATCAA GTTTTTTGGG GTCGAGGTGC CGTAAAGCAC TAAATCGGAA CCCTAAAGGG 540 

AGCCCCCGAT TTAGAGCTTG ACGGGGAAAG CCGGCGAACG TGGCGAGAAA GGAAGGGAAG 600

AAAGCGAAAG GAGCGGGCGC TAGGGCGCTG GCAAGTGTAG CGGTCACGCT GCGCGTAACC 660 

ACCACACCCG CCGCGCTTAA TGCGCCGCTA CAGGGCGCGT CAGGTGGCAC TTTTCGGGGA 720

AATGTGCGCG GAACCCCTAT TTGTTTATTT TTCTAAATAC ATTCAAATAT GTATCCGCTC 780 

ATGAGACAAT AACCCTGATA AATGCTTCAA TAATATTGAA AAAGGAAGAG TCCTGAGGCG 840 
GAAAGAACCA GCTGTGGAAT GTGTGTCAGT TAGGGTGTGG AAAGTCCCCA GGCTCCCCAG 900
CAGGCAGAAG TATGCAAAGC ATGCATCTCA ATTAGTCAGC AACCAGGTGT GGAAAGTCCC 960
CAGGCTCCCC AGCAGGCAGA AGTATGCAAA GCATGCATCT CAATTAGTCA GCAACCATAG 1020
TCCCGCCCCT AACTCCGCCC ATCCCGCCCC TAACTCCGCC CAGTTCCGCC CATTCTCCGC 1080
CCCATGGCTG ACTAATTTTT TTTATTTATG CAGAGGCCGA GGCCGCCTCG GCCTCTGAGC 1140
TATTCCAGAA GTAGTGAGGA GGCTTTTTTG GAGGCCTAGG CTTTTGCAAA GATCGATCAA 1200
GAGACAGGAT GAGGATCGTT TCGCATGATT GAACAAGATG GATTGCACGC AGGTTCTCCG 1260
GCCGCTTGGG TGGAGAGGCT ATTCGGCTAT GACTGGGCAC AACAGACAAT CGGCTGCTCT 1320
GATGCCGCCG TGTTCCGGCT GTCAGCGCAG GGGCGCCCGG TTCTTTTTGT CAAGACCGAC 1380
CTGTCCGGTG CCCTGAATGA ACTGCAAGAC GAGGCAGCGC GGCTATCGTG GCTGGCCACG 1440
ACGGGCGTTC CTTGCGCAGC TGTGCTCGAC GTTGTCACTG AAGCGGGAAG GGACTGGCTG 1500
CTATTGGGCG AAGTGCCGGG GCAGGATCTC CTGTCATCTC ACCTTGCTCC TGCCGAGAAA 1560
GTATCCATCA TGGCTGATGC AATGCGGCGG CTGCATACGC TTGATCCGGC TACCTGCCCA 1620
TTCGACCACC AAGCGAAACA TCGCATCGAG CGAGCACGTA CTCGGATGGA AGCCGGTCTT 1680
GTCGATCAGG ATGATCTGGA CGAAGAGCAT CAGGGGCTCG CGCCAGCCGA ACTGTTCGCC 1740
AGGCTCAAGG CGAGCATGCC CGACGGCGAG GATCTCGTCG TGACCCATGG CGATGCCTGC 1800
TTGCCGAATA TCATGGTGGA AAATGGCCGC TTTTCTGGAT TCATCGACTG TGGCCGGCTG 1860
GGTGTGGCGG ACCGCTATCA GGACATAGCG TTGGCTACCC GTGATATTGC TGAAGAGCTT 1920
GGCGGCGAAT GGGCTGACCG CTTCCTCGTG CTTTACGGTA TCGCCGCTCC CGATTCGCAG 1980
CGCATCGCCT TCTATCGCCT TCTTGACGAG TTCTTCTGAG CGGGACTCTG GGGTTCGAAA 2040
TGACCGACCA AGCGACGCCC AACCTGCCAT CACGAGATTT CGATTCCACC GCCGCCTTCT 2100
ATGAAAGGTT GGGCTTCGGA ATCGTTTTCC GGGACGCCGG CTGGATGATC CTCCAGCGCG 2160
GGGATCTCAT GCTGGAGTTC TTCGCCCACC CTAGGGGGAG GCTAACTGAA ACACGGAAGG 2220
AGACAATACC GGAAGGAACC CGCGCTATGA CGGCAATAAA AAGACAGAAT AAAACGCACG 2280
GTGTTGGGTC GTTTGTTCAT AAACGCGGGG TTCGGTCCCA GGGCTGGCAC TCTGTCGATA 2340
CCCCACCGAG ACCCCATTGG GGCCAATACG CCCGCGTTTC TTCCTTTTCC CCACCCCACC 2400
CCCCAAGTTC GGGTGAAGGC CCAGGGCTCG CAGCCAACGT CGGGGCGGCA GGCCCTGCCA 2460
TAGCCTCAGG TTACTCATAT ATACTTTAGA TTGATTTAAA ACTTCATTTT TAATTTAAAA 2520
GGATCTAGGT GAAGATCCTT TTTGATAATC TCATGACCAA AATCCCTTAA CGTGAGTTTT 2580
CGTTCCACTG AGCGTCAGAC CCCGTAGAAA AGATCAAAGG ATCTTCTTGA GATCCTTTTT 2640
TTCTGCGCGT AATCTGCTGC TTGCAAACAA AAAAACCACC GCTACCAGCG GTGGTTTGTT 2700
TGCCGGATCA AGAGCTACCA ACTCTTTTTC CGAAGGTAAC TGGCTTCAGC AGAGCGCAGA 2760
TACCAAATAC TGTCCTTCTA GTGTAGCCGT AGTTAGGCCA CCACTTCAAG AACTCTGTAG 2820
CACCGCCTAC ATACCTCGCT CTGCTAATCC TGTTACCAGT GGCTGCTGCC AGTGGCGATA 2880
AGTCGTGTCT TACCGGGTTG GACTCAAGAC GATAGTTACC GGATAAGGCG CAGCGGTCGG 2940
GCTGAACGGG GGGTTCGTGC ACACAGCCCA GCTTGGAGCG AACGACCTAC ACCGAACTGA 3000
GATACCTACA GCGTGAGCTA TGAGAAAGCG CCACGCTTCC CGAAGGGAGA AAGGCGGACA 3060
GGTATCCGGT AAGCGGCAGG GTCGGAACAG GAGAGCGCAC GAGGGAGCTT CCAGGGGGAA 3120
ACGCCTGGTA TCTTTATAGT CCTGTCGGGT TTCGCCACCT CTGACTTGAG CGTCGATTTT 3180
TGTGATGCTC GTCAGGGGGG CGGAGCCTAT GGAAAAACGC CAGCAACGCG GCCTTTTTAC 3240
GGTTCCTGGC CTTTTGCTGG CCTTTTGCTC ACATGTTCTT TCCTGCGTTA TCCCCTGATT 3300
CTGTGGATAA CCGTATTACC GCCATGCATT AGTTATTAAT AGTAATCAAT TACGGGGTCA 3360
TTAGTTCATA GCCCATATAT GGAGTTCCGC GTTACATAAC TTACGGTAAA TGGCCCGCCT 3420
GGCTGACCGC CCAACGACCC CCGCCCATTG ACGTCAATAA TGACGTATGT TCCCATAGTA 3480
ACGCCAATAG GGACTTTCCA TTGACGTCAA TGGGTGGAGT ATTTACGGTA AACTGCCCAC 3540
TTGGCAGTAC ATCAAGTGTA TCATATGCCA AGTACGCCCC CTATTGACGT CAATGACGGT 3600
AAATGGCCCG CCTGGCATTA TGCCCAGTAC ATGACCTTAT GGGACTTTCC TACTTGGCAG 3660
TACATCTACG TATTAGTCAT CGCTATTACC ATGGTGATGC GGTTTTGGCA GTACATCAAT 3720
GGGCGTGGAT AGCGGTTTGA CTCACGGGGA TTTCCAAGTC TCCACCCCAT TGACGTCAAT 3780
GGGAGTTTGT TTTGGCACCA AAATCAACGG GACTTTCCAA AATGTCGTAA CAACTCCGCC 3840
CCATTGACGC AAATGGGCGG TAGGCGTGTA CGGTGGGAGG TCTATATAAG CAGAGCTGGT 3900
TTAGTGAACC GTCAGATCCG CTAGCGCTAC CGGTCGCCAC CATGGTGAGC AAGGGCGAGG 3960
AGCTGTTCAC CGGGGTGGTG CCCATCCTGG TCGAGCTGGA CGGCGACGTA AACGGCCACA 4020
AGTTCAGCGT GTCCGGCGAG GGCGAGGGCG ATGCCACCTA CGGCAAGCTG ACCCTGAAGT 4080
TCATCTGCAC CACCGGCAAG CTGCCCGTGC CCTGGCCCAC CCTCGTGACC ACCCTGACCT 4140
ACGGCGTGCA GTGCTTCAGC CGCTACCCCG ACCACATGAA GCAGCACGAC TTCTTCAAGT 4200
CCGCCATGCC CGAAGGCTAC GTCCAGGAGC GCACCATCTT CTTCAAGGAC GACGGCAACT 4260
ACAAGACCCG CGCCGAGGTG AAGTTCGAGG GCGACACCCT GGTGAACCGC ATCGAGCTGA 4320
AGGGCATCGA CTTCAAGGAG GACGGCAACA TCCTGGGGCA CAAGCTGGAG TACAACTACA 4380
ACAGCCACAA CGTCTATATC ATGGCCGACA AGCAGAAGAA CGGCATCAAG GTGAACTTCA 4440
AGATCCGCCA CAACATCGAG GACGGCAGCG TGCAGCTCGC CGACCACTAC CAGCAGAACA 4500
CCCCCATCGG CGACGGCCCC GTGCTGCTGC CCGACAACCA CTACCTGAGC ACCCAGTCCG 4560
CCCTGAGCAA AGACCCCAAC GAGAAGCGCG ATCACATGGT CCTGCTGGAG TTCGTGACCG 4620
CCGCCGGGAT CACTCTCGGC ATGGACGAGC TGTACAAGTC CGGACTCAGA TCTCGAGCTC 4680
AAGCTTCGAA TTCTGCAGTC GACGGTACCG CGGGCCCGGG ATCCAAGTAT CCAGATATTG 4740
CCTCACCCAC ATTTCGAAGG TTGTTTGGTG CCAAGGCAGG TGGCAAATCT GCCTCTGCAC 4800
CTAATACTGA GGGTGTGAAA TCTTCCTCAG TAATGCCCAG CCCTAGTACC ACATTAGCGC 4860
GGCAAGGCAG TCTGGAGTCA CCGTCGTCCG GTACGGGCAG CATGGGCAGT GCTGGTGGGC 4920
TAAGCGGCAG CAGCAGCCCT CTCTTCAATA AACCCTCAGA CTTAACTACA GATGTTATAA 4980
GCTTAAGTCA CTCGTTGGCC TCCAGCCCAG CATCGGTTCA CTCTTTCACA TCAGGTGGTC 5040
TCGTGTGGGC TGCCAATATG AGCAGTTCCT CTGCAGGCAG CAAGGATACT CCGAGCTACC 5100
AGTCCATGAC TAGCCTCCAC ACGAGCTCTG AGTCCATTGA CCTCCCCCTC AGCCATCATG 5160
GCTCCTTGTC TGGACTGACC ACAGGCACTC ACGAGGTCCA GAGCCTGCTC ATGAGAACGG 5220
GTAGTGTGAG ATCTACTCTC TCAGAAAGCA TGCAGCTTGA CAGAAATACA CTACCCAAAA 5280
AGGGACTAAG ATATACCCCA TCATCTCGGC AGGCCAACCA AGAAGAGGGC AAAGAGTGGT 5340
TGCGTTCTCA TTCTACTGGA GGGCTTCAGG ACACTGGCAA CCAGTCACCT CTGGTTTCCC 5400
CTTCTGCCAT GTCATCTTCT GCAGCTGGAA AATACCACTT TTCTAACTTG GTGAGCCCAA 5460
CAAATTTGTC TCAATTTAAC CTTCCCGGGC CCAGCATGAT GCGCTCAAAC AGCATCCCAG 5520
CCCAAGACTC TTCCTTCGAT CTCTATGATG ACTCCCAGCT TTGTGGGAGT GCCACTTCTC 5580
TGGAGGAAAG ACCTCGTGCC ATCAGTCATT CGGGCTGATT CAGAGACAGC ATGGAAGAAG 5640
TTCATGGCTC TTCATTATCA CTGGTGTCCA GCACTTCTTC TCTTTACTCT ACAGCTGAAG 5700
AAAAGGCTCA TTCAGAGCAA ATCCATAAAC TGCGGAGAGA GCTGGTTGCA TCACAAGAAA 5760
AAGTTGCTAC CCTCACATCT CAGCTTTCAG CAAATGCTCA CCTTGTAGCA GCTTTTGAAA 5820
AGAGCTTAGG GAATATGACT GGCCGATTGC AAAGTCTAAC TATGACAGCG GAACAAAAGG 5880
AATCTGAACT TATAGAACTA AGAGAAACCA TTGAAATGCT GAAGGCTCAG AATTCTGCTG 5940
CCCAGGCGGC TATTCAGGGA GCACTGAATG GTCCAGACCA TCCTCCCAAA GATCTTCGCA 6000
TCAGAAGACA GCATTCCTCT GAAAGTGTTT CTAGTATCAA CAGTGCCACA AGCCATTCCA 6060
GTATTGGCAG TGGTAATGAT GCCGACTCCA AGAAGAAGAA AAAGAAAAAC TGGGTGAACT 6120
CTAGAGGAAG TGAGCTGAGA AGTTCTTTCA AACAAGCCTT TGGGAAGAAA AAGTCCACCA 6180
AGCCTCCTTC ATCACATTCT GACATTGAAG AGCTTACTGA TTCATCCCTT CCGGCATCCC 6240
CCAAGTTACC CCATAATGCT GGTGACTGTG GCTCAGCATC CATGAAGCCC TCACAATCTG 6300
CTTCAGCGAT CTGTGAATGC ACAGAAGCTG AGGCAGAGAT AATTCTGCAG CTGAAGAGCG 6360
AGCTCAGAGA AAAGGAATTA AAATTAACGG ATATTCGGCT GGAGGCCCTC AGCTCTGCTC 6420
ATCATCTTGA TCAGATCCGG GAAGCCATGA ACCGGATGCA GAATGAAATT GAAATACTGA 6480
AAGCTGAAAA TGACCGGTTG AAGGCAGAAA CTGGTAACAC AGCTAAGCCT ACTCGGCCAC 6540
CGTCAGAATC CTCAAGCAGC ACCTCCTCTT CATCTTCCAG GCAGTCATTA GGACTTTCTC 6600
TAAACAATTT GAACATCACA GAGGCTGTTA GCTCAGATAT TTTGCTAGAT GATGCTGGTG 6660
ATGCAACTGG ACATAAAGAT GGCCGCAGTG TGAAAATTAT AGTCTCCATA AGCAAGGGCT 6720
ATGGTCGAGC AAAGGACCAA AAATCTCAGG CATATTTGAT AGGCTCCATT GGTGTTAGTG 6780
GAAAAACCAA GTGGGATGTC TTAGATGGTG TAATAAGACG TCTCTTTAAG GAATATGTAT 6840
TCCGAATTGA TACATCCACT AGCCTTGGTC TGAGCTCTGA CTGCATTGCT AGCTACTGTA 6900
TAGGAGACTT AATTAGATCC CATAACCTAG AAGTGCCTGA ATTGCTGCCT TGTGGATACC 6960
TTGTTGGAGA TAATAACATC ATCACTGTGA ACCTCAAAGG GGTAGAAGAA AATAGTTTGG 7020
ACAGTTTTGT TTTTGATACG CTGATTCCTA AACCAATTAC CCAAAGGTAC TTTAACTTGT 7080
TGATGGAGCA TCACAGAATT ATACTCTCAG GACCGAGTGG TACTGGAAAG ACCTATTTGG 7140
CAAACAAACT TGCTGAATAT GTAATAACCA AATCTGGAAG GAAAAAAACA GAGGATGCAA 7200
TTGCCACTTT TAATGTGGAC CACAAGTCAA GTAAGGAATT GCAACAATAT CTAGCTAACC 7260
TGGCTGAACA GTGCAGTGCT GATAATAATG GAGTGGAGCT CCCAGTTGTA ATAATTCTTG 7320
ATAATCTTCA TCATGTGGGC TCTCTGAGTG ATATCTTCAA TGGTTTTCTC AATTGTAAAT 7380
AGAACAAATG TCCATATATT ATTGGAACAA TGAATCAGGG AGTTTCTTCA TCACCAAATC 7440
TAGAGCTGCA TCACAATTTC AGGTGGGTAT TATGTGCAAA TCATACAGAA CCAGTGAAAG 7500
GCTTTTTAGG CAGATATCTT CGAAGAAAAC TCATAGAGAT AGAAATTGAA AGGAACATTC 7560
GCAATAATGA CCTAGTCAAA ATTATAGATT GGATTCCGAA GACGTGGCAT CATCTCAACA 7620
GTTTTTTGGA AACACACAGT TCTTCTGACG TTACCATTGG TCCCCGACTA TTCCTTCCTT 7680
GCCCCATGGA TGTAGAAGGT TCTAGAGTAT GGTTCATGGA TCTCTGGAAC TATTCTTTAG 7740
TACCTTATAT TCTGGAGGCA GTGAGAGAGG GTCTTCAGAT GTATGGGAAA CGCACACCAT 7800
GGGAAGATCC TTCAAAGTGG GTGCTTGACA CATATCCATG GAGCTCAGCA ACTCTGCCTC 7860
AGGAGAGCGC AGCCTTACTT CAGCTGCGAC CAGAAGATGT TGGGTATGAA AGCTGCACAT 7920
CCACTAAGGA AGCCACAACC TCAAAGCACA TTCCGCAAAC TGACACAGAA GGAGATCCCC 7980
TGATGAATAT GCTAATGAAA CTCCAAGAAG CAGCCAATTA CTCAAGCACA CAAAGCTGCG 8040
ACAGCGAAAG CACCAGCCAC CATGAAGACA TTTTGGATTC ATCTCTTGAA TCTACCCTCT 8100
AGAGGGTGAA AGCCGAAATC CAGCACACTG GCGGCCGTTA 8140



// 



Legend: pGI3303 was obtained by inserting the 3421 bp
BamHI/SpeI fragment of Hs-Unc53/3GLd22__PCR2.1_D02 into
a BamHI/XbaI-opened pEGFP-C1 vector (Clontech Inc.). This
plasmid encodes an eGFP protein in fusion with the
C-terminal half of Hs-unc-53/3 (last 1128 AA). Arrows
indicate the ORFs.
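The feature coordinates of the pGI3303 record can be sanity-checked by computing the length of each span; a coding sequence should come out as a whole number of codons. The following is a minimal sketch, assuming the `FT key start..end` line layout of the record, with coordinates taken as 1-based and inclusive. The helper name `feature_spans` is hypothetical, and the regular expression tolerates stray spaces around the `..` separator.

```python
import re

# Feature lines as given in the pGI3303 record (spacing normalised).
RECORD = """\
FT CDS 1225..2019
FT promoter 3330..3918
"""

# Matches "FT <key> <start>..<end>"; qualifier lines such as "FT /label=..." are skipped.
FEATURE_LINE = re.compile(r"^FT\s+(\w+)\s+(\d+)\s*\.\s*\.\s*(\d+)", re.MULTILINE)

def feature_spans(record):
    """Return (key, start, end, length_in_bp) for every feature line."""
    spans = []
    for key, start, end in FEATURE_LINE.findall(record):
        start, end = int(start), int(end)
        spans.append((key, start, end, end - start + 1))
    return spans

for key, start, end, length in feature_spans(RECORD):
    codons = f" ({length // 3} codons)" if key == "CDS" and length % 3 == 0 else ""
    print(f"{key} {start}..{end}: {length} bp{codons}")
```

For the first CDS this gives 795 bp, which is a whole number of codons (265); the promoter span is 589 bp.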
Figure 7c: Illustration of the AA sequence of the GFP::C-terminal
Hs-unc-53/3 fragment (insert of pGI3303)

MVS KGEELFTGWP I LVELDGDVNGHKFSVSGEGEGDATYGKLTLKF I CT TGKXi PVPWPTLVTTLT YGV 
OC F SRYPDHMKQHDFFKSAMPEGYVQERT I FFKDDGNYKTRAEVKFEGDTLVNRI E LKG I DFKEDGNI L 
GH KLEYNYNSHNVYIMADKQKNGIKV^ 
LSKDP NEKRDHMVLLEFVTAAGITI^MDELYK SGL 

AKAGGKSASA . P NTEGVKS SSVMPSPSTTIiAROGSXiES PSSGTGSMGSAGGLSGSSSPT.iFNKPSDLTTDV 
ISLSHSIAS SPASVHSFT SGGLWAANMSSSSAGSKOTPSYOSMTS LHTSSESIDLPLSHHGSLSGLTT 
GTHEVOSLLMRTGSWSTLSESMQLDI^ I. PKKGLRYTPSSR0ANO EEGKEWLRSHSTGGLODTGN0SP 
LVSPSAMSSSAAGKYHFSNLVSPTNLSOFNLPGPSMMRSNSIPAO DSSFDLYD PSOIjCGSATS LEERPR 
AISHSGSFRDSMEEVHGSSr.SLVSSTSS^YSTAEEKA HSEOIHKLRRELVASOEKVATLTSOLSANAHL 

v^fekslg nmtgrlo sltmtaeokeselielretiemlkaonsaaoaaic^ 

HSSESVSSINSATSHSSIGSGNDADSKKKKKKNWVN SR GSELRSSFKQA FGKKKSTKPPSSHSDIEELT 

dsslpaspklphnagdcgsasmkpsosasaicecteaeaeiilol kselreke lkltdirlea lssahh 

LDOIREAMNRMONEIEILKAENDRLKAETGNTAKPTRPPSESS SSTSSSSSROSL GLSLNNLNITEAVS 

sdillddagdatohkix3RSvktivsiskgygrakdoks oayligsigvsgktkwdvlix;virrlfkew 

FRIDTSTSI^IiS SDCIA SYCIGDLIRSHNIjEVPELLPCGYLVGDNN IITVNLKGVEENSLDSFVFDTLI 

pkpitoryfi^i^hhritlsgpsgtgktyij^iaewitksgrk ktedaiatfnvd 

ANl^EOCSADNNGVELPWXILDI^HHVGSI.SDXFNGF I.NCKYNKCPYIIGTMNOGVSSSPNLELHHNF 

rwvlc^tepvkgfi^ryi.rrklieieternirni^lvkiidwip k twhhlnsflet h 

LFLPCPMDVEGSRVWFMDLWNYSLV^^ 

llolrpedvgyesctstkeattskhipotdtegdplmnmlmki,o eaanyssto scdsestshh edilds 

SLESTL 

Legend: Single underlined AA sequence represents eGFP. 
Double underlined AA sequence represents the C-terminal 
fragment of Hs-unc-53/3 
Figure 7d: Illustration of the plasmid map and the
nucleotide sequence of the pGI3305 expression vector
(full length Hs-unc-53/3 in fusion with GFP)



[Plasmid map of pGI3305 (11842 bp) showing Kan/Neo, Pcmv, eGFP and Hs-unc-53/3, with restriction sites Sac II (1) and Xho I (4689)]


ID pGI3305 circular DNA; 11842 BP
FT CDS 1245..2039
FT /vntifkey="4"
FT /label=Kan/Neo
FT CDS 3895..10983
FT /vntifkey="4"
FT /label=Hs-unc-53/3\ (full\ length)
FT CDS 3962..4678
FT /vntifkey="4"
FT /label=eGFP
FT promoter 3350..3938
FT /vntifkey="29"
FT /label=Pcmv
SQ SEQUENCE 11842 BP;



GGGCCCGGGA TCCACCGGAT CTAGATAACT GATCATAATC AGCCATACCA CATTTGTAGA 60 

GGTTTTACTT GCTTTAAAAA ACCTCCCACA CCTCCCCCTG AACCTGAAAC ATAAAATGAA 120 

TGCAATTGTT GTTGTTAACT TGTTTATTGC AGCTTATAAT GGTTACAAAT AAAGCAATAG 180 

CATCACAAAT TTCACAAATA AAGCATTTTT TTCACTGCAT TCTAGTTGTG GTTTGTCCAA 240 

ACTCATCAAT GTATCTTAAC GCGTAAATTG TAAGCGTTAA TATTTTGTTA AAATTCGCGT 300

TAAATTTTTG TTAAATCAGC TCATTTTTTA ACCAATAGGC CGAAATCGGC AAAATCCCTT 360

ATAAATCAAA AGAATAGACC GAGATAGGGT TGAGTGTTGT TCCAGTTTGG AACAAGAGTC 420

CACTATTAAA GAACGTGGAC TCCAACGTCA AAGGGCGAAA AACCGTCTAT CAGGGCGATG 480

GCCCACTACG TGAACCATCA CCCTAATCAA GTTTTTTGGG GTCGAGGTGC CGTAAAGCAC 540

TAAATCGGAA CCCTAAAGGG AGCCCCCGAT TTAGAGCTTG ACGGGGAAAG CCGGCGAACG 600

TGGCGAGAAA GGAAGGGAAG AAAGCGAAAG GAGCGGGCGC TAGGGCGCTG GCAAGTGTAG 660

CGGTCACGCT GCGCGTAACC ACCACACCCG CCGCGCTTAA TGCGCCGCTA CAGGGCGCGT 720

CAGGTGGCAC TTTTCGGGGA AATGTGCGCG GAACCCCTAT TTGTTTATTT TTCTAAATAC 780

ATTCAAATAT GTATCCGCTC ATGAGACAAT AACCCTGATA AATGCTTCAA TAATATTGAA 840

AAAGGAAGAG TCCTGAGGCG GAAAGAACCA GCTGTGGAAT GTGTGTCAGT TAGGGTGTGG 900

AAAGTCCCCA GGCTCCCCAG CAGGCAGAAG TATGCAAAGC ATGCATCTCA ATTAGTCAGC 960

AACCAGGTGT GGAAAGTCCC CAGGCTCCCC AGCAGGCAGA AGTATGCAAA GCATGCATCT 1020

CAATTAGTCA GCAACCATAG TCCCGCCCCT AACTCCGCCC ATCCCGCCCC TAACTCCGCC 1080 

CAGTTCCGCC CATTCTCCGC CCCATGGCTG ACTAATTTTT TTTATTTATG CAGAGGCCGA 1140 
GGCCGCCTCG GCCTCTGAGC TATTCCAGAA GTAGTGAGGA GGCTTTTTTG GAGGCCTAGG 1200

CTTTTGCAAA GATCGATCAA GAGACAGGAT GAGGATCGTT TCGCATGATT GAACAAGATG 1260

GATTGCACGC AGGTTCTCCG GCCGCTTGGG TGGAGAGGCT ATTCGGCTAT GACTGGGCAC 1320

AACAGACAAT CGGCTGCTCT GATGCCGCCG TGTTCCGGCT GTCAGCGCAG GGGCGCCCGG 1380

TTCTTTTTGT CAAGACCGAC CTGTCCGGTG CCCTGAATGA ACTGCAAGAC GAGGCAGCGC 1440 

GGCTATCGTG GCTGGCCACG ACGGGCGTTC CTTGCGCAGC TGTGCTCGAC GTTGTCACTG 1500 

AAGCGGGAAG GGACTGGCTG CTATTGGGCG AAGTGCCGGG GCAGGATCTC CTGTCATCTC 1560 

ACCTTGCTCC TGCCGAGAAA GTATCCATCA TGGCTGATGC AATGCGGCGG CTGCATACGC 1620 

TTGATCCGGC TACCTGCCCA TTCGACCACC AAGCGAAACA TCGCATCGAG CGAGCACGTA 1680

CTCGGATGGA AGCCGGTCTT GTCGATCAGG ATGATCTGGA CGAAGAGCAT CAGGGGCTCG 1740 

CGCCAGCCGA ACTGTTCGCC AGGCTCAAGG CGAGCATGCC CGACGGCGAG GATCTCGTCG 1800 

TGACCCATGG CGATGCCTGC TTGCCGAATA TCATGGTGGA AAATGGCCGC TTTTCTGGAT 1860 

TCATCGACTG TGGCCGGCTG GGTGTGGCGG ACCGCTATCA GGACATAGCG TTGGCTACCC 1920 

GTGATATTGC TGAAGAGCTT GGCGGCGAAT GGGCTGACCG CTTCCTCGTG CTTTACGGTA 1980 

TCGCCGCTCC CGATTCGCAG CGCATCGCCT TCTATCGCCT TCTTGACGAG TTCTTCTGAG 2040 

CGGGACTCTG GGGTTCGAAA TGACCGACCA AGCGACGCCC AACCTGCCAT CACGAGATTT 2100 

CGATTCCACC GCCGCCTTCT ATGAAAGGTT GGGCTTCGGA ATCGTTTTCC GGGACGCCGG 2160 

CTGGATGATC CTCCAGCGCG GGGATCTCAT GCTGGAGTTC TTCGCCCACC CTAGGGGGAG 2220 

GCTAACTGAA ACACGGAAGG AGACAATACC GGAAGGAACC CGCGCTATGA CGGCAATAAA 2280 

AAGACAGAAT AAAACGCACG GTGTTGGGTC GTTTGTTCAT AAACGCGGGG TTCGGTCCCA 2340 

GGGCTGGCAC TCTGTCGATA CCCCACCGAG ACCCCATTGG GGCCAATACG CCCGCGTTTC 2400 

TTCCTTTTCC CCACCCCACC CCCCAAGTTC GGGTGAAGGC CCAGGGCTCG CAGCCAACGT 2460

CGGGGCGGCA GGCCCTGCCA TAGCCTCAGG TTACTCATAT ATACTTTAGA TTGATTTAAA 2520

ACTTCATTTT TAATTTAAAA GGATCTAGGT GAAGATCCTT TTTGATAATC TCATGACCAA 2580

AATCCCTTAA CGTGAGTTTT CGTTCCACTG AGCGTCAGAC CCCGTAGAAA AGATCAAAGG 2640

ATCTTCTTGA GATCCTTTTT TTCTGCGCGT AATCTGCTGC TTGCAAACAA AAAAACCACC 2700 

GCTACCAGCG GTGGTTTGTT TGCCGGATCA AGAGCTACCA ACTCTTTTTC CGAAGGTAAC 2760

TGGCTTCAGC AGAGCGCAGA TACCAAATAC TGTCCTTCTA GTGTAGCCGT AGTTAGGCCA 2820

CCACTTCAAG AACTCTGTAG CACCGCCTAC ATACCTCGCT CTGCTAATCC TGTTACCAGT 2880

GGCTGCTGCC AGTGGCGATA AGTCGTGTCT TACCGGGTTG GACTCAAGAC GATAGTTACC 2940

GGATAAGGCG CAGCGGTCGG GCTGAACGGG GGGTTCGTGC ACACAGCCCA GCTTGGAGCG 3000

AACGACCTAC ACCGAACTGA GATACCTACA GCGTGAGCTA TGAGAAAGCG CCACGCTTCC 3060

CGAAGGGAGA AAGGCGGACA GGTATCCGGT AAGCGGCAGG GTCGGAACAG GAGAGCGCAC 3120 

GAGGGAGCTT CCAGGGGGAA ACGCCTGGTA TCTTTATAGT CCTGTCGGGT TTCGCCACCT 3180 

CTGACTTGAG CGTCGATTTT TGTGATGCTC GTCAGGGGGG CGGAGCCTAT GGAAAAACGC 3240

CAGCAACGCG GCCTTTTTAC GGTTCCTGGC CTTTTGCTGG CCTTTTGCTC ACATGTTCTT 3300 

TCCTGCGTTA TCCCCTGATT CTGTGGATAA CCGTATTACC GCCATGCATT AGTTATTAAT 3360 

AGTAATCAAT TACGGGGTCA TTAGTTCATA GCCCATATAT GGAGTTCCGC GTTACATAAC 3420 

TTACGGTAAA TGGCCCGCCT GGCTGACCGC CCAACGACCC CCGCCCATTG ACGTCAATAA 3480

TGACGTATGT TCCCATAGTA ACGCCAATAG GGACTTTCCA TTGACGTCAA TGGGTGGAGT 3540

ATTTACGGTA AACTGCCCAC TTGGCAGTAC ATCAAGTGTA TCATATGCCA AGTACGCCCC 3600

CTATTGACGT CAATGACGGT AAATGGCCCG CCTGGCATTA TGCCCAGTAC ATGACCTTAT 3660

GGGACTTTCC TACTTGGCAG TACATCTACG TATTAGTCAT CGCTATTACC ATGGTGATGC 3720

GGTTTTGGCA GTACATCAAT GGGCGTGGAT AGCGGTTTGA CTCACGGGGA TTTCCAAGTC 3780

TCCACCCCAT TGACGTCAAT GGGAGTTTGT TTTGGCACCA AAATCAACGG GACTTTCCAA 3840 

AATGTCGTAA CAACTCCGCC CCATTGACGC AAATGGGCGG TAGGCGTGTA CGGTGGGAGG 3900 

TCTATATAAG CAGAGCTGGT TTAGTGAACC GTCAGATCCG CTAGCGCTAC CGGTCGCCAC 3960 

CATGGTGAGC AAGGGCGAGG AGCTGTTCAC CGGGGTGGTG CCCATCCTGG TCGAGCTGGA 4020 

CGGCGACGTA AACGGCCACA AGTTCAGCGT GTCCGGCGAG GGCGAGGGCG ATGCCACCTA 4080 

CGGCAAGCTG ACCCTGAAGT TCATCTGGAC CACCGGCAAG CTGCCCGTGC CCTGGCCCAC 4140 

CCTCGTGACC ACCCTGACCT ACGGCGTGCA GTGCTTCAGC CGCTACCCCG ACCACATGAA 4200 

GCAGCACGAC TTCTTCAAGT CCGCCATGCC CGAAGGCTAC GTCCAGGAGC GCACCATCTT 4260 

CTTCAAGGAC GACGGCAACT ACAAGACCCG CGCCGAGGTG AAGTTCGAGG GCGACACCCT 4320 

GGTGAACCGC ATCGAGCTGA AGGGCATCGA CTTCAAGGAG GACGGCAACA TCCTGGGGCA 4380 

CAAGCTGGAG TACAACTACA ACAGCCACAA CGTCTATATC ATGGCCGACA AGCAGAAGAA 4440 

CGGCATCAAG GTGAACTTCA AGATCCGCCA CAACATCGAG GACGGCAGCG TGCAGCTCGC 4500 

CGACCACTAC CAGCAGAACA CCCGCATCGG CGACGGCCCC GTGCTGCTGC CCGACAACCA 4560 

CTACCTGAGC ACCCAGTCCG CCCTGAGCAA AGACCCCAAC GAGAAGCGCG ATCACATGGT 4620

CCTGCTGGAG TTCGTGACCG CCGCCGGGAT CACTCTCGGC ATGGACGAGC TGTACAAGTA 4680

CTCAGATCTC GAGCATATGC CTGTTCTTGG GGTTGCCTCA AAACTGAGGC AGCCAGCTGT 4740

TGGGTCAAAG CCTGTGCATA CTGCTCTTCC GATACCAAAT CTTGGCACTA CTGGGTCACA 4800

GCACTGTTCT TCAAGACCTT TGGAACTTGC TGAAACAGAG AGCTCCATGC TTTCTTGTCA 4860

GCTTGCGTTA AAATCAACCT GTGAATTTGG AGAGAAGAAA CCCCTCCAAG GAAAAGCCAA 4920

GGAGAAAGAA GACAGCAAGA TTTACACTGA CTGGGCCAAC CACTACCTAG CAAAATCAGG 4980

CCACAAGCGG CTGATCAAGG ACTTGCAACA AGACATTGCA GATGGAGTAC TCCTAGCAGA 5040

AATCATCCAG ATTATTGCAA ATGAAAAAGT TGAAGATATC AATGGATGTC CTAGAAGTCA 5100 

GTCTCAGATG ATTGAAAATG TTGATGTCTG CCTTAGTTTT CTAGCAGCCA GAGGGGTAAA 5160 

TGTTCAAGGT CTATCTGCTG AAGAAATAAG AAATGGAAAC TTAAAAGCCA TTCTAGGGCT 5220 

GTTTTTCAGT TTATCTCGCT ACAAGCAGCA ACAACACCAT CAACAACAGT ACTATCAGTC 5280

CTTGGTGGAA CTTCAGCAGC GAGTTACTCA CGCTTCCCCT CCATCGGAAG CCAGCCAGGC 5340 

CAAAACCCAG CAAGATATGC AGTCCAGTCT GGCAGCCAGA TATGCAACTC AGTCTAATCA 5400 

CAGTGGAATT GCAACCAGTC AAAAAAAGCC TACTAGGCTT CCAGGGCCCT CTAGGGTGCC 5460

TGCTGCAGGA AGCAGCAGCA AGGTCCAGGG AGCCTCTAAT TTAAATAGGA GAAGTCAGAG 5520

CTTTAACAGC ATTGACAAAA ACAAGCCTCC AAATTATGCA AATGGAAACG AAAAAGATTC 5580

CTCCAAAGGA CCTCAATCGT CTTCAGGTGT AAATGGTAAC GTGCAGCCTC CCAGTACTGC 5640

TGGGCAGCCT CCTGCC7CTG CCATCCCTTC TCCAAGTGCC AGCAAGCCCT GGCGCAGCAA 5700

GTCCATGAAT GTCAAACACA GTGCCACCTC CACCATGTTG ACTGTAAAGC AGTCAAGTAC 5760

AGCCACCTCC CCCACACCAT CTTCAGACAG ACTGAAGCCA CCTGTCTCAG AAGGGGTCAA 5820 

AACTGCTCCC TCAGGACAGA AATCCATGCT TGAGAAATTC AAGCTAGTCA ATGCCCGGAC 5880 

TGCTTTACGC CCCCCGCAGC CTCCCAGTTC AGGACCTAGT GATGGTGGGA AGGATGATGA 5940 

TGCCTTTTCT GAATCTGGTG AAATGGAAGG TTTTAACAGT GGTCTGAATA GTGGTGGCTC 6000 

AACAAATAGC AGTCCCAAAG TGTCACCTAA GTTGGCCCCT CCAAAAGCTG GAAGCAAAAA 6060 

TCTCAGCAAT AAAAAGTCTT TGCTACAGCC AAAGGAAAAA GAAGAAAAGA ACAGGGACAA 6120 

AAATAAAGTT TGCACTGAAA AACCAGTCAA AGAAGAGAAG GATCAGGTGA CAGAGATGGC 6180 

TCCAAAAAAG ACCTCCAAAA TTGCAAGCTT GATCCCTAAG GGCAGCAAGA CAACAGCAGC 6240 

TAAGAAGGAA AGCTTAATTC CGTCTTCCAG TGGTATTCCA AAACCAGGCT CTAAAGTTCC 6300 

AACAGTAAAG CAAACCATTT CACCTGGCAG CACAGCAAGC AAAGAGTCTG AGAAATTCAG 6360 

GACTACCAAG GGGAGCCCTT CCCAGTCCTT ATCTAAGCCT ATAACCATGG AGAAAGCAAG 6420 

TGCTTCTAGT TGTCCTGCCC CTTTGGAAGG AAGGGAAGCT GGCCAAGCTT CTCCTTCTGG 6480 

TTCCTGTACC ATGACAGTGG CACAAAGCAG TGGGCAGAGC ACAGGAAATG GTGCTGTCCA 6540 

ACTCCCTCAA CAGCAGCAAC ATAGCCACCC GAATACCGCG ACAGTGGCAC CATTCATTTA 6600 

CAGGGCACAT TCAGAAAATG AAGGTACCGC TTTACCATCG GCTGACTCCT GTACCAGTCC 6660 

TACAAAGATG GACTTATCAT ATAGTAAGAC TGCTAAGCAG TGCCTGGAGG AGATATCTGG 6720 

TGAAGACCCT GAAACAAGAA GAATGAGAAC AGTTAAAAAC ATAGCAGACT TGAGGCAGAA 6780 

TTTAGAAGAG ACTATGTCCA GTCTTCGTGG GACTCAGATA AGCCACAGCA CCCTGGAGAC 6840 

AACATTTGAC AGCACTGTGA CAACAGAAGT TAATGGAAGG ACCATACCCA ACTTGACAAG 6900 

TCGACCCACC CCCATGACCT GGAGGTTGGG CCAGGCATGT CCGCGACTTC AGGCGGGAGA 6960 

TGCTCCCTCC CTGGGTGCTG GCTATCCTCG CAGTGGTACC AGTCGATTCA TCCACACAGA 7020

CCCCTCAAGG TTCATGTATA CCACGCCTCT CCGTCGAGCT GCTGTCTCTA GGCTGGGAAA 7080 

CATGTCACAG ATTGACATGA GTGAGAAAGC AAGCAGTGAC CTGGACATGT CTTCTGAGGT 7140 

CGATGTGGGT GGATATATGA GTGATGGTGA TATCCTTGGG AAAAGTCTCA GGACTGATGA 7200 

CATCAACAGT GGGTACATGA CAGATGGAGG ACTTAACCTA TATACTAGAA GTCTGAACCG 7260 

AATACCAGAC ACAGCAACTT CCCGGGACAT CATCCAGAGA GGGGTTCACG ATGTGACAGT 7320

GGATGCAGAC AGCTGGGATG ACAGCAGTTC AGTGAGCAGT GGTCTCAGTG ACACCCTTGA 7380

TAACATCAGC ACTGATGACC TGAACACCAC ATCCTCTGTC AGCTCTTACT CCAACATCAC 7440

CGTCCCCTCT AGGAAGAATA CTCAGCTGAG GACAGATTCA GAGAAACGCT CCACCACAGA 7500

CGAGACCTGG GATAGTCCTG AGGAACTGAA AAAACCAGAA NAAGATTTTG ACAGCCATGG 7560

GGATGCTGGT GGCAAGTGGA AGACTGTGTC CTCTGGACTT CCTGAAGACC CCGAGAAGGC 7620

AGGGCAGAAA GCTTCCCTGT CTGTTTCACA GACAGGTTCC TGGAGAAGAG GCATGTCTGC 7680

CCAAGGAGGG GCGCCATCTA GGCAGAAAGC TGGAACAAGT GCACTCAAAA CACCCGGGAA 7740

AACCGATGAT GCCAAAGCTT CTGAGAAAGG AAAAGCTCCC CTAAAAGGAT CATCTCTACA 7800

AAGATCTCCT TCAGATGCAG GAAAAAGCAG TGGAGATGAA GGGAAAAAGC CCCCCTCAGG 7860

CATTGGAAGA TCGACTGCCA CCAGCTCCTT TGGCTTTAAG AAACCAAGTG GAGTAGGGTC 7920

ATCTGCCATG ATCACCAGCA GTGGAGCAAC CATAACAAGT GGCTCTGCAA CACTGGGTAA 7980 

AATTCCAAAA TCTGCTGCCA TTGGCGGGAA GTCAAATGCA GGGAGAAAAA CCAGTTTGGA 8040 

CGGTTCACAG AATCAGGATG ATGTTGTGCT GCATGTTAGC TCAAAGACTA CCCTACAATA 8100 

TCGCAGCTTG CCCCGCCCTT CAAAATCCAG CACCAGTGGC ATTCCTGGAC GAGGAGGCCA 8160 

CAGATCCAGT ACCAGCAGTA TTGATTCCAA CGTCAGCAGC AAGTCTGCTG GGGCCACCAC 8220

CTCGAAACTG AGAGAACCAA CTAAAATTGG GTCAGGGCGC TCAAGTCCTG TCACCGTCAA 8280

CCAAACAGAC AAGGAAAAGG AAAAAGTAGC AGTCTCAGAT TCAGAAAGTG TTTCTTTGTC 8340

AGGTTCCCCC AAATCCAGCC CCACCTCTGC CAGCGCCTGT GGTGCACAAG GTCTCAGGCA 8400 

GCCAGGATCC AAGTATCCAG ATATTGCCTC ACCCACATTT CGAAGGTTGT TTGGTGCCAA 8460 

GGCAGGTGGC AAATCTGCCT CTGCACCTAA TACTGAGGGT GTGAAATCTT CCTCAGTAAT 8520 

GCCCAGCCCT AGTACCACAT TAGCGCGGCA AGGCAGTCTG GAGTCACCGT CGTCCGGTAC 8580

GGGCAGCATG GGCAGTGCTG GTGGGCTAAG CGGCAGCAGC AGCCCTCTCT TCAATAAACC 8640

CTCAGACTTA ACTACAGATG TTATAAGCTT AAGTCACTCG TTGGCCTCCA GCCCAGCATC 8700 

GGTTCACTCT TTCACATCAG GTGGTCTCGT GTGGGCTGCC AATATGAGCA GTTCCTCTGC 8760 

AGGCAGCAAG GATACTCCGA GCTACCAGTC CATGACTAGC CTCCACACGA GCTCTGAGTC 8820

CATTGACCTC CCCCTCAGCC ATCATGGCTC CTTGTCTGGA CTGACCACAG GCACTCACGA 8880

GGTCCAGAGC CTGCTCATGA GAACGGGTAG TGTGAGATCT ACTCTCTCAG AAAGCATGCA 8940

GCTTGACAGA AATACACTAC CCAAAAAGGG ACTAAGATAT ACCCCATCAT CTCGGCAGGC 9000

CAACCAAGAA GAGGGCAAAG AGTGGTTGCG TTCTCATTCT ACTGGAGGGC TTCAGGACAC 9060

TGGCAACCAG TCACCTCTGG TTTCCCCTTC TGCCATGTCA TCTTCTGCAG CTGGAAAATA 9120 

CCACTTTTCT AACTTGGTGA GCCCAACAAA TTTGTCTCAA TTTAACCTTC CCGGGCCCAG 9180 

CATGATGCGC TCAAACAGCA TCCCAGCCCA AGACTCTTCC TTCGATCTCT ATGATGACTC 9240 

CCAGCTTTGT GGGAGTGCCA CTTCTCTGGA GGAAAGACCT CGTGCCATCA GTCATTCGGG 9300

CTCATTCAGA GACAGCATGG AAGAAGTTCA TGGCTCTTCA TTATCACTGG TGTCCAGCAC 9360

TTCTTCTCTT TACTCTACAG CTGAAGAAAA GGCTCATTCA GAGCAAATCC ATAAACTGCG 9420

GAGAGAGCTG GTTGCATCAC AAGAAAAAGT TGCTACCCTC ACATCTCAGC TTTCAGCAAA 9480

TGCTCACCTT GTAGCAGCTT TTGAAAAGAG CTTAGGGAAT ATGACTGGCC GATTGCAAAG 9540

TCTAACTATG ACAGCGGAAC AAAAGGAATC TGAACTTATA GAACTAAGAG AAACCATTGA 9600 

AATGCTGAAG GCTCAGAATT CTGCTGCCCA GGCGGCTATT CAGGGAGCAC TGAATGGTCC 9660 

AGACCATCCT CCCAAAGATC TTCGCATCAG AAGACAGCAT TCCTCTGAAA GTGTTTCTAG 9720

TATCAACAGT GCCACAAGCC ATTCCAGTAT TGGCAGTGGT AATGATGCCG ACTCCAAGAA 9780

GAAGAAAAAG AAAAACTGGG TGAACTCTAG AGGAAGTGAG CTGAGAAGTT CTTTCAAACA 9840

AGCCTTTGGG AAGAAAAAGT CCACCAAGCC TCCTTCATCA CATTCTGACA TTGAAGAGCT 9900

TACTGATTCA TCCCTTCCGG CATCCCCCAA GTTACCCCAT AATGCTGGTG ACTGTGGCTC 9960

AGCATCCATG AAGCCCTCAC AATCTGCTTC AGCGATCTGT GAATGCACAG AAGCTGAGGC 10020

AGAGATAATT CTGCAGCTGA AGAGCGAGCT CAGAGAAAAG GAATTAAAAT TAACGGATAT 10080

TCGGCTGGAG GCCCTCAGCT CTGCTCATCA TCTTGATCAG ATCCGGGAAG CCATGAACCG 10140

GATGCAGAAT GAAATTGAAA TACTGAAAGC TGAAAATGAC CGGTTGAAGG CAGAAACTGG 10200

TAACACACCT AAGCCTACTC GGCCACCGTC AGAATCCTCA AGCAGCACCT CCTCTTCATC 10260

TTCCAGGCAG TCATTAGGAC TTTCTCTAAA CAATTTGAAC ATCACAGAGG CTGTTAGCTC 10320

AGATATTTTG CTAGATGATG CTGGTGATGC AACTGGACAT AAAGATGGCC GCAGTGTGAA 10380

AATTATAGTC TCCATAAGCA AGGGCTATGG TCGAGCAAAG GACCAAAAAT CTCAGGCATA 10440

TTTGATAGGC TCCATTGGTG TTAGTGGAAA AACCAAGTGG GATGTCTTAG ATGGTGTAAT 10500

AAGACGTCTC TTTAAGGAAT ATGTATTCCG AATTGATACA TCCACTAGCC TTGGTCTGAG 10560

CTCTGACTGC ATTGCTAGCT ACTGTATAGG AGACTTAATT AGATCCCATA ACCTAGAAGT 10620

GCCTGAATTG CTGCCTTGTG GATACCTTGT TGGAGATAAT AACATCATCA CTGTGAACCT 10680

CAAAGGGGTA GAAGAAAATA GTTTGGACAG TTTTGTTTTT GATACGCTGA TTCCTAAACC 10740

AATTACCCAA AGGTACTTTA ACTTGTTGAT GGAGCATCAC AGAATTATAC TCTCAGGACC 10800

GAGTGGTACT GGAAAGACCT ATTTGGCAAA CAAACTTGCT GAATATGTAA TAACCAAATC 10860

TGGAAGGAAA AAAACAGAGG ATGCAATTGC CACTTTTAAT GTGGACCACA AGTCAAGTAA 10920

GGAATTGCAA CAATATCTAG CTAACCTGGC TGAACAGTGC AGTGCTGATA ATAATGGAGT 10980

GGAGCTCCCA GTTGTAATAA TTCTTGATAA TCTTCATCAT GTGGGCTCTC TGAGTGATAT 11040

CTTCAATGGT TTTCTCAATT GTAAATACAA CAAATGTCCA TATATTATTG GAACAATGAA 11100

TCAGGGAGTT TCTTCATCAC CAAATCTAGA GCTGCATCAC AATTTCAGGT GGGTATTATG 11160

TGCAAATCAT ACAGAACCAG TGAAAGGCTT TTTAGGCAGA TATCTTCGAA GAAAACTCAT 11220

AGAGATAGAA ATTGAAAGGA ACATTCGCAA TAATGACCTA GTCAAAATTA TAGATTGGAT 11280

TCCGAAGACG TGGCATCATC TCAACAGTTT TTTGGAAACA CACAGTTCTT CTGACGTTAC 11340

CATTGGTCCC CGACTATTCC TTCCTTGCCC CATGGATGTA GAAGGTTCTA GAGTATGGTT 11400

CATGGATCTC TGGAACTATT CTTTAGTACC TTATATTCTG GAGGCAGTGA GAGAGGGTCT 11460

TCAGATGTAT GGGAAACGCA CACCATGGGA AGATCCTTCA AAGTGGGTGC TTGACACATA 11520

TCCATGGAGC TCAGCAACTC TGCCTCAGGA GAGCCCAGCC TTACTTCAGC TGCGACCAGA 11580

AGATGTTGGG TATGAAAGCT GCACATCCAC TAAGGAAGCC ACAACCTCAA AGCACATTCC 11640

GCAAACTGAC ACAGAAGGAG ATCCCCTGAT GAATATGCTA ATGAAACTCC AAGAAGCAGC 11700

CAATTACTCA AGCACACAAA GCTGCGACAG CGAAAGCACC AGCCACCATG AAGACATTTT 11760

GGATTCATCT CTTGAATCTA CCCTCTAGAG GGTGAAAGCC GAAATCCAGC ACACTGGCGG 11820

CCGTTACTAG TGGATCGGCC GC 11842



// 



Legend: pGI3305 was obtained by inserting a 7148 bp
XhoI/SacII fragment of the Hs-unc-53 /3AlLd22 clone into a
XhoI/SacII-opened pEGFPc3 vector (Clontech Inc.). This
plasmid encodes an eGFP protein fused to the full-length
Hs-unc-53/3 (2363 AA). Arrows indicate the ORFs.
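
The relationship between the nucleotide listing above and the encoded fusion protein can be checked by translating an ORF codon by codon. The following is a minimal sketch only, assuming the standard genetic code; the helper names `clean` and `translate` are illustrative and are not part of the original disclosure. The two test fragments are taken from the pGI3305 listing (the first codons of the eGFP ORF and the last codons of the Hs-unc-53/3 ORF), with OCR spacing repaired.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(seq):
    """Drop spaces, position numbers and other layout characters."""
    return "".join(ch for ch in seq.upper() if ch in "ACGTN")


def translate(dna, start=0):
    """Translate from offset `start` up to (not including) the first stop codon.

    Codons containing an ambiguous base (N) are reported as 'X'.
    """
    dna = clean(dna)
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Fragments copied from the pGI3305 listing (assumed reading frame noted).
egfp_start = "ATGGTGAGC AAGGGCGAGG AGCTGTTCAC CGGGGTGGTG"  # frame 0
orf_end = "GGATTCATCT CTTGAATCTA CCCTCTAGAG"               # frame +1

print(translate(egfp_start))
print(translate(orf_end, start=1))
```

The same function can be pointed at any stretch of the listing once the reading frame is known; it deliberately stops at the first stop codon rather than scanning for ORFs.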
Figure 7e: Illustration of the AA sequence of GFP::Hs-unc-53/3
(insert of pGI3305)




GL VWAANMSSSSAGSKDTgSYOSn'raiitiTa ae.aititiruJ"""^"-'""' 

ESMOLDRNTI.PKKflT.RYTPSSRnANOEEGKE MT.BSHSTGGL QiyrGNOSPLVS PgAM.qsSAAGKYHFSNL 

ygPT MT.finPWTiPGPSMMRSN fiTPAODSSFDTiYDDSOIiCGS A TSLEERPPAISHSGSFRDSMEEVHGSSL 

SLVSSTSSLY .^TAF.EKAHSE OTHKLRRELVASnKKVA TLTS OLSANAHLVAA F^KST^NMTGRLOSLTM 

TAEOKESELTELRETIEMT.KAONSAAOAATOGALNGPP HPPKDLRIRROHSSESVSSINSATSHSSIGS 

GNDADSKKKK KKNWVNSRGS F.T.RSSFKO AF GXKKSTKPPSSH SDIEELTDSST.P&.^PTCT.PHNAGDCGSA 

-.. -^^.^r-rTT or voc-r.ppvPT.TfT.TnTBT.F.AT.SSAHHLDOIREAMNRMONEIEILKA 




ENDRLKAETGNTAKPTKt j yat,aa^araaaaJi\uci^^-j""""'^ 

KIIVSISKGYGRAKDOKSQAYT.TGSIGVSGKTKWDVLD GVIRRLFKEYVFRIDTSTST.GT.SSDCIASYC 

IGDLIRSHNLEVPELLPCGYLVGPNNTITVNLKGVEENSLDSFVFDTLIPKP ITORYFNLLMEHHRIIL 

SGPSGTGKTYI^KLAEYVITKSGRKK'TFnATATFNVPHKSSKELO OYLANLAEOgSADMNGVELPVVI 

ILDNLHHVGSLSDIFNGFLNCKYNKCPYTTGTMNOGVSSSPNLELHHN FRV^CANHTF.PVKGFLGRYL 

RRKLIEIEIE RNIRNNDLV KTIDWIPKTWHHT.NSFLE THS SSDVTIGPRLF T.PCPMPVEGSRVWFMDLW 

NYSLVPYILF.AVREGLOMYGKRTPWEDPSKWVT.DTYPWSSATLPOES PALLOLRPEDVGYF.SCTSTKEA 

TTSKHIPOTDTEGDPLMNMLMKLOEAANYSSTOSCDSESTS HHEDILDSSLESTL 

Legend: Single underlined AA sequence represents eGFP.
Double underlined AA sequence represents full length
Hs-unc-53/3.

FIG. 8 Illustration of the filopodia and lamellipodia
outgrowth of N4 mouse neuroblastoma cells transfected
with pGI3303.

A: 




C: 



Legend: Fluorescence images of N4 cells transfected with
pEGFP (A) compared to pGI3303-transfected cells (B and
C). A: control (pEGFP) transfected cells. B: Illustration
of filopodia outgrowth (arrowhead). C: Illustration of
lamellipodia outgrowth (arrowhead). Notice the actin
sheets at the edge of the cells.






WO 99/63080



PCT/EP99/03848 



FIG. 9 Illustration of the co-localization of the
GFP-Hs-unc-53/3 fusion protein with microtubules in N4 mouse
neuroblastoma cells transfected with pGI3305



A: 




B: 






Legend: Fluorescence images of N4 cells transfected with
pEGFP (A) compared to pGI3305-transfected cells (B and
C). A: control transfected cells. B: Illustration of
co-localization of Hs-unc-53/3 with microtubules. Notice the
centrosome in the right picture (arrowhead) and enhanced
filopodia outgrowth in the left picture (arrowhead). C:
Illustration of the co-localization of Hs-unc-53/3 with the
(+)-end of microtubules (arrowhead).

Figure 11a: Illustration of the homology between
Hs-unc-53/3 and a gene encoded (partially) by the Drosophila
melanogaster BAC clone BACR48M05 (AC005719). Results of a
TBLASTN search on the non-redundant database with
Hs-unc-53/3 as query.

Query: Hs-unc-53/3 (direct) 2120aa Length 2119 from: 1 to: 2119

Sbjct: gb|AC005719|AC005719 Drosophila melanogaster, chromosome 2R, region
38A5-38B4, BAC clone BACR48M05, complete sequence [Drosophila melanogaster]
Length = 188357

Score = 64.0 bits (153), Expect = 4e-08 
Identities = 28/58 (48%), Positives = 41/58 (70%) 

Query: 1     IYTDWANHYLAKSGHKRLIKDLQQDIADGVLLAEIIQIIANEKVEDINGCPRSQSQMI 58
             IYTDWAN+YL ++  KR + DL  D  DG+LLAE+I+ + + KV D+   P++Q QM+
Sbjct: 84874 IYTDWANYYLERAKSKRKVTDLSADCRDGLLLAEVIEAVTSFKVPDLVKKPKNQQQMV 84701

Score =39.9 bits (91), Expect = 0.77 

Identities = 22/55 (40%), Positives = 34/55 (61%) 

Query: 48 NGCPRSQSQMIENVDVCLSFLAARGVN-VQGLSAEEIRNGNLKAILGLFFSLSRYK 102 

N C Q +NV+ CL L ++ V «•+ + + +1 G LKA+L LFF+LSR+K 
Sbjct: 55621 NSCSLFQ---FDNVNSCLHVXRSQSVGGLENITTNDICAGRLKAVLALFFALSRFK 55463 

Score = 35.2 bits (79), Expect = 3.8 

Identities = 31/72 (43%), Positives = 45/72 (62%) 

Query: 1266 LEERPRAISHSGSFRDSMEEVHGSSLSLVSSTSSLYSTAEEKAHSEQIHKLRRELVASQE 1325 

L+ R + HS S VHGS SL+S SSLY AEE+ + +1 +L+REL +++ 

Sbjct: 13387 LKSRLMQLCHSVSV SVHGSAASLLSGGSSLYGNAEER-QAHEIRRLKRELQDARD 13226 

Query: 1326 KVATLTSQLSAN 1337

+V +L+SQLS N 
Sbjct: 13225 QVLSLSSQLSTN 13190 
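
The "Identities" figures reported in the alignments above are simple column counts, and can be recomputed from any gap-aligned pair. The sketch below is illustrative only: the function name `alignment_identity` is hypothetical, the two strings are the first query/subject pair of this figure transcribed with OCR damage repaired (an assumption on the transcription), and the percentage is truncated rather than rounded because that is what the printed values in these figures appear to do.

```python
def alignment_identity(query, subject, gap="-"):
    """Return (identities, columns, percent) for two gap-aligned strings.

    `percent` is truncated to an integer, which appears to match the
    percentages printed in the BLAST output shown in the figures.
    """
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have equal length")
    columns = len(query)
    identities = sum(
        1 for q, s in zip(query, subject) if q == s and q != gap
    )
    percent = 100 * identities // columns
    return identities, columns, percent


# First high-scoring pair of Figure 11a (query residues 1-58).
query = "IYTDWANHYLAKSGHKRLIKDLQQDIADGVLLAEIIQIIANEKVEDINGCPRSQSQMI"
subject = "IYTDWANYYLERAKSKRKVTDLSADCRDGLLLAEVIEAVTSFKVPDLVKKPKNQQQMV"

identities, columns, percent = alignment_identity(query, subject)
print(f"Identities = {identities}/{columns} ({percent}%)")
```

The "Positives" count additionally depends on the substitution matrix used by the search program and is therefore not reproduced here.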
Figure 11b: Illustration of an ORF encoded by the
Drosophila melanogaster BAC clone BACR48M05 (AC005719) as
predicted by the computer program Fgene.

Output file for REVERSE STRAND of FGene 
F469BE1C 

length of sequence - 188357

number of predicted exons - 21

positions

Amino acid sequence - 1788aa

MDSGICYIKPEYLVTEADGGSAAANTENSDTNKRKREIX5GEVEAGEKKKWDKKERKRGQN 
KNRPWKDERYSHLCHSLIDGTGGEPCSLANCRYV^ 

YCARGVSCRFAKAHTDEQGRNLKREDYDENAPPTTCNGVS S AAS STLHNASMQMNPLTNM 
KNVLKLSEHELQHGGKXSWHDMYKDSAWIFVAGFPYTLTEGDLVCVFSQYGEVVNINLIR 
DSKTGKSKHSPLYRGEILFRIPELSQIPDPLCFLCNSIIOiNSEVIiNPANFPMDIGIPNPY 
TNEQLVNAKLEQQNLEKLFNELENTASMSNSQESKDTETTSTALVESSTSTNSASSAGSC 
SIx^PAQQSMKKXLTFLNLSPFRSGKKSIDK^ 

KDQRSHTVPTESNYVLFNPGPVPSRHVQYKIRKPRPLSTHSDADSGFLSPCSPEEMRANP 
AILVLQQCDSVQGYMEIYTDWANYYLERAKSKRKVTDLSADCRTCLLLAEVIEAVTSFKV 
PDLVKKPKNQQQMFDNVNSCLHVLRSQSVGGLENITTNDICAGRLKAVLAIjFFALSRFKQ 
QAKQTKSIGVGCGGGVGGSSSTLTGSGSVLGIGIGGLRTPGSSLNQDKNQQEQQQQQQQQ 
QTPQQLAQSLENGNEMVNRQ I A PAYAKVNGGTAI PLP ATVMVQRRC P PDKVRPL P PTPNH 
TPSIPGLGKSGSDFNTSRPNSPPTSNHTIQSLKSGNNNSLRPPSIKSGIPSPSSPQTAPQ 
KHSMLDKLKLFNKEKQQNAVNAASVASKTQIQSKRTSSSSGFSSARSERSDSSLSLNDGH 
GSQLKPPSISVSSQKPQPKTKQSKLLAAQQKKEQANKATKLDKKEKSPARSLNKEESGNE 
SRSSTMGRTGKSSLVRAVGGVEKNTPKTSSKSSLHSKSDSKSSLKAPQLLQSPSSGGLPK 
PIAAIKGTSKLPSLGGGAGHLPAAESQQNQQLLKRETSDISSNISQPPPAEPPISTHAHI 
HQNQTPPPPYYANSQPTSHISSHGFLSEPSTPQHSSGIYGSSRLPPPKSALSAPRKLEYN 
AGPHILSSPTHHQRQGLPRPLVNSAPNTPTASPNKFHTIPSKIVGTIYESKEEQLLPAPP 
PASGGSSILPMRPLLRGYNSHVTLPTRGARGGHHPHQSYLDFCESDIGQGYCSDGDALRV 
GSSPGGSRFHDIDNGYLSEGSSGLNGPSSSAGGISPGKHFLSMMRARTQLPTTIEERQLI 
YGASVPILTLLPDRKIYQNNVRQIKVDKLAAMAERWNMELGNGGAKMDGSPHHRPGSRNG 
RDNWSKMPEPLNGQKVEKSDKSSPSRRSMGGGGSGSSSKQGSPSSSSRTKGVPPSFGYVK 
RANGSIASTAEQQNIA>!MMAAGGAGANGLPCGRTAHVSAVPRTASGRKVAGGTQTLPNDM 
NKLPPNTQHRSFSLTGPTATQLSQSIRERLATGSHSLPKPGSDLHVFQHRISNRGGTRHD
GSLSDTQTYAEVKPEYSSYAMWLKHSNTAGSRLSDGESVEQLQIGSPALTRHGHKMIHNR
SGGPGQMAGQMSGNESPYVQSPRMNRSNSIRSTKSEKMYPSMMSRAGEVEIEPYYCLPVG 
TNGVLTAQMAAAMAAQS QAAQGNPGVGVNVGGVAWSQ PT S PT PLTRGPFNTAAGAS VLS P 
THGTTSAAGLVGPGGGAGGGAMVGHRLTYPKKNDEVHGSAASLLSGGSSLYGNAEERQAH 
EIRRLKRELQDARDQVLSLSSQLSTNVSKKCPVWFQMYTLRMARSRR*

Figure 11c: Illustration of a 'BLAST 2 sequences' search
result with Hs-unc-53/3 as query and the Fgene predicted
UNC53 homology ORF of Drosophila melanogaster BAC clone
BACR48M05 as subject

Query: Hs-unc-53/3 (direct) 2120aa Length 2119 from: 1 to: 2119
Subject: drosUNC53 (Fgene_prediction) Length 1788 from: 1 to: 1788

Score = 106 bits (261), Expect = 2e-21

Identities = 190/840 (22%), Positives = 294/840 (34%), Gaps = 185/840 (22%)

Query: 1   IYTDWANHYLAKSGHKRLIKDLQQDIADGVLLAEIIQIIANEKVEDINGCPRSQSQMIEN 60
           IYTDWAN+YL ++  KR + DL  D  DG+LLAE+I+ + + KV D+   P++Q QM +N
Sbjct: 497 IYTDWANYYLERAKSKRKVTDLSADCRDGLLLAEVIEAVTSFKVPDLVKKPKNQQQMFDN 556

Query: 61 VD VC L S F LAARG V - NVQGL S AEE I RNGNLKAI LGLFFS L S RYK 102 

V+ CL L V + + + + +1 G LKA+L LFF+LSR+K 
Sbjct: 557 VN S C LHVL R S QSVGG LENI TTND I C AG RLKAVLAL FF AL SRFKQQ AXQTKS IG VG C GG G V 616 

Query: 103 XXXXXXXXXXXSLVEL QQRVTHASPPSEASQAKTQQDMQSSLAARYATQSNHSG 156 

S++ + R +S •»■ +Q + QQ Q + QS +G 

Sbjct : 617 GGSSSTLTGSGSVLGIGIGGLRTPGSSIJJQDKNQQEQQQQQQQQQTPQQLAQSLENGNEM 67 6 

Query: 157 IATSQKK PTRLPGPSRV PAAGSSSKVQGASNLNRRSQSFNS 197 

IA + K T+PP+V P + + L + FN+ 

Sbjct: 677 VNRQIAPAYAKVNGGTAIPLPATVMVQRRCPPDKVRPLPPTPNHTPSIPGLGKSGSDFNT 73 6 

Query: 198 IDKNKPPNYANGNEKDSSKGPQS-SSGVNGNVQPPSTAGQXXXXXXXXXXXXKPWRSKSM 256 

N PP S+ QS SG N +++PPS 
Sbjct: 737 SRPNSPPT SNHTIQSLKSGNNNSLRPPSIKSGI 769 

Query: 257 NVKHSATSTMLTVKQXXXXXXXXXXXDRLKPPVSEGVKTAPSGQKSMLEKFKLVNARTAL 316 

P +TAP + SML+K XL N 

Sbjct: 770 PSPSSPQTAPQ-KHSMLDKLKLFNKEKQQ 797 

Query: 317 RXXXXXXXXXXXXXXXXXAFSESGEMEGFXXXXXXXXXXXXXPKVSPKLAPPKAGSKNLS 37 6 

S SG >L PP S ++S 
Sbjct: 798 NAVNAASVASKTQ I QSKRTS S S SGFS S — ARSERSDSSLSLNDGHGSQLKPP SISVS 852 

Query: 377 NKKSLLQPXXXXXXNRDKNKVCTEKPVKEEKDQVTEMAPKKTSKIASLIPKGSKTTAAKK 436 

++K QP ++K+ + KE+ + + T+ + K+ S SL + S + + 

Sbjct: 853 SQKP--QP KTKQSKLLAAQQKKEQANKATKLDKKEKSPARSLNKEESGNES--R 902 

Query: 437 ESLXXXXXXXXXXXXXXXTVKQTISPGSTASKESEKFRTTKGSPSQSLSKPITMEKASAS 496 

S K T S +S S K SL P ++ S+ 

Sbjct: 903 SSTMGRTGKSSLVRAVGGVEKNTPKTSSKSSLHS KSDSKSSLKAPQLLQSPSSG 956 

Query : 497 SCPAPLEGREAGQASPSGSCTMTVAQSSGQSTGNGAVQLP QQQQHSHPNTATVA- 550 

p p+ + P S G GA LP Q QQ T+ ++ 

Sbjct: 957 GLPKPIAAIKGTSKLP SLGGGAGHLPAAESQQNQQLLKRETSDISS 1002 

Query: 551 PFIYRAHSENEGTALPSADSCTSPTKMDLSYSKTAKQCLEEISGEGPETR 600 

P AH TP+ + PT S+ + + fS + 

Sbjct: 1003 NISQPPPAEPPISTHAHIHQNQTPPPPYYANSQPTSHISSHGFLSEPSTPQHSSGIYGSS 1062 

Query : 601 RMRTVKNIADLRQNLEETMSSLRGTQISHSTLETTFDSTVTTEVNGRTI-PN-LTSRPTP 658 

R+ K+ + LE + +H + V + N T PN ♦ P+ 

Sbjct: 1063 RLPPPKSALSAPRKLEYNAGPHILSSPTHHQRQGLPRPLVNSAPNTPTASPNKFHTIPSK 1122 

Query: 659 MTWRLGQACPRLQAGDAPSLGAGYPRSGTSRFIHTDPSRFMY TTPLRRAAVSRLGN 714 

+ + + + + LA P SG S + P Y TPRA * 

Sbjct: 1123 IVGTI YESKEEQLLPAPPPASGGSSILPMRPLLRGYNSHVTLPTRGARGGHHPH 1176 

Query: 715 MSQIDMSEKASSDLDMSSEVDVG-GYMSDGDIL--GKS LRTDDINSGYMTDG — GLN 766 

S +D E D+G GY SDGD L G S R DI++GY+++G GLN 

Sbjct: 1177 QSYLDFCES--- DIGQGYCSDGDALRVGSSPGGSRFHDIDNGYLSEGSSGLN 1225 
Figure 12: Illustration of an EST encoding a part of the
Zebrafish-UNC-53/2 cDNA.

Sbjct= emb|AI658309|AI658309 fc21d06.y1 Zebrafish WashU MPIMG EST Danio
rerio cDNA 5' similar to TR:Q20427 Q20427 F45E10.1 mRNA sequence. Length = 44!

Score = 277 bits (702) , Expect = 4e-73 

Identities = 124/147 (84%), Positives = 136/147 (92%) 

Frame = +3 

Query 2121 LHHNFRWVLCANHTEPVXGFLGRFLRRK^ 2180 

LHHNFRW+LCANHTEPVXGFLGRFLRRKL+ETEI+ RVRN ELVKII+WIP VWHHLNRF 
Sbjct: 3 LHHNFRWILCAITOTEPVKGFLGRFLRRK^^ 182 

Query; 2181 LEAHSSSDVTIGPRLFLSCPIDVIXSSRVWFTDLWOTSIIPYLLEAVREGLQLYGRRAPV^ 2240 

LE HSSSDVTIGPRLFLSCP+DV+GSRVWFTDLWNYSIIPY+LEAVREGLQ+YGR+A WE 
Sbjct: 183 LETHSSSDVTIGPRLFLSCPMDVEGSRVWFTDLWNYS 1 1 PYMLEAVREGLQMYGRKASWE 3 62 

Query: 2241 DPAKWVMDTYPWAASPQQHEWPPLLQL 2267 

DPAKWVM++ ASPQQHEW LL+L 
Sbjct: 3 63 DPAKWVMESLLCVAS PQQHEWH SLLRL 443 



Query= hh2UNC53 (2340 letters)

Figure 13. Genemap98 results for Hs-Unc53/2





G3 Map: Chr.11 SHGC-33456
Reference interval: D11S921-D11S1359 (24.9-32.5 cM)
Physical position: 911 cR10000 (F)
RH details: RHdb RH32790
Typed by: Stanford (see SHGC-33456)

UniGene Hs.13830

RH Mapping Results

Electronic PCR Results

ESTs (from GenBank EST division)

AA115015 zl04d10.s1 Soares pregnant uterus NbHPU Homo sapiens cDNA clone 491347 3'
STS ... 134 bp: SHGC-33456

AA918601 ol53e11.s1 Soares_NFL_T_GBC_S1 Homo sapiens cDNA clone IMAGE:1527212 3'
STS 16 ... 143 bp: SHGC-33456

AI248585 qh71f08.x1 Soares_fetal_liver_spleen_1NFLS_S1 Homo sapiens cDNA clone
IMAGE:1850151 3', mRNA sequence [Homo sapiens]
STS 19 ... 146 bp: SHGC-33456

T71262 yd35b09.s1 Homo sapiens cDNA clone 110201 3'
STS 91 ... 136 bp: SHGC-33456

[Map display: RH Map (GB4, G3), Genetic Map, Gene Density, Cytogenetic Ideogram]

The thick line on the G3 map indicates the position of
SHGC-33456. See also: equivalent interval on GB4 map.

About This Interval
Top of interval: D11S921 (24.9 cM)
Bottom of interval: D11S1359 (32.5 cM)
Genetic size of bin: 8 cM
Physical size of bin: 430 cR10000
Figure 15: Illustration of the nucleotide sequence of
pGI3150 and amino acid sequence of the eGFP fusion with a
C-terminal fragment of Hs-Unc-53/1.
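
The feature table given with this figure lists three coding regions of pGI3150 by inclusive coordinates (KanNeo 1225..2019, eGFP 3942..4658, Hu-unc-53/1 4719..7214). As a quick consistency check, each span should be a whole number of codons. The following is a small sketch of that arithmetic; the helper name `cds_summary` is hypothetical, and the coordinates are simply copied from the feature table.

```python
# Inclusive CDS coordinates as listed in the pGI3150 feature table.
FEATURES = {
    "KanNeo": (1225, 2019),
    "eGFP": (3942, 4658),
    "Hu-unc-53/1": (4719, 7214),
}


def cds_summary(start, end):
    """Return (length in bp, whole codons, leftover bases) for start..end."""
    length = end - start + 1          # coordinates are inclusive
    codons, remainder = divmod(length, 3)
    return length, codons, remainder


for label, (start, end) in FEATURES.items():
    length, codons, remainder = cds_summary(start, end)
    print(f"{label}: {length} bp, {codons} codons, remainder {remainder}")
```

A non-zero remainder for any feature would point to a mis-read coordinate rather than to a property of the plasmid itself.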



[Plasmid map of pGI3150; labelled features: Hu-unc-53/1, KanNeo, eGFP]



ID   PGI3150   circular DNA; 7655 BP
DE   from coiled coil I till end
FT   CDS   1225..2019
FT         /vntifkey="4"
FT         /label=KanNeo
FT   CDS   3942..4658
FT         /vntifkey="4"
FT         /label=eGFP
FT   CDS   4719..7214
FT         /vntifkey="4"
FT         /label=Hu-unc-53/1
SQ   SEQUENCE 7655 BP;



CTAGATAACT GATCATAATC AGCCATACCA CATTTGTAGA GGTTTTACTT GCTTTAAAAA 60
ACCTCCCACA CCTCCCCCTG AACCTGAAAC ATAAAATGAA TGCAATTGTT GTTGTTAACT 120
TGTTTATTGC AGCTTATAAT GGTTACAAAT AAAGCAATAG CATCACAAAT TTCACAAATA 180
AAGCATTTTT TTCACTGCAT TCTAGTTGTG GTTTGTCCAA ACTCATCAAT GTATCTTAAC 240
GCGTAAATTG TAAGGGTTAA TATTTTGTTA AAATTCGCGT TAAATTTTTG TTAAATCAGC 300
TCATTTTTTA ACCAATAGGC CGAAATCGGC AAAATCCCTT ATAAATCAAA AGAATAGACC 360
GAGATAGGGT TGAGTGTTGT TCCAGTTTGG AACAAGAGTC CACTATTAAA GAACGTGGAC 420
TCCAACGTCA AAGGGCGAAA AACCGTCTAT CAGGGCGATG GCCCACTACG TGAACCATCA 480
CCCTAATCAA GTTTTTTGGG GTCGAGGTGC CGTAAAGCAC TAAATCGGAA CCCTAAAGGG 540
AGCCCCCGAT TTAGAGCTTG ACGGGGAAAG CCGGCGAACG TGGCGAGAAA GGAAGGGAAG 600
AAAGCGAAAG GAGCGGGCGC TAGGGCGCTG GCAAGTGTAG CGGTCACGCT GCGCGTAACC 660
ACCACACCCG CCGCGCTTAA TGCGCCGCTA CAGGGCGCGT CAGGTGGCAG TTTTCGGGGA 720
AATGTGCGCG GAACCCCTAT TTGTTTATTT TTCTAAATAC ATTCAAATAT GTATCCGCTC 780
ATGAGACAAT AACCCTGATA AATGCTTCAA TAATATTGAA AAAGGAAGAG TCCTGAGGCG 840
GAAAGAACCA GCTGTGGAAT GTGTGTCAGT TAGGGTGTGG AAAGTCCCCA GGCTCCCCAG 900
CAGGCAGAAG TATGCAAAGC ATGCATCTCA ATTAGTCAGC AACCAGGTGT GGAAAGTCCC 960
CAGGCTCCCC AGCAGGCAGA AGTATGCAAA GCATGCATCT CAATTAGTCA GCAACCATAG 1020
TCCCGCCCCT AACTCCGCCC ATCCCGCCCC TAACTCCGCC CAGTTCCGCC CATTCTCCGC 1080
CCCATGGCTG ACTAATTTTT TTTATTTATG CAGAGGCCGA GGCCGCCTCG GCCTCTGAGC 1140
TATTCCAGAA GTAGTGAGGA GGCTTTTTTG GAGGGCTAGG CTTTTGCAAA GATCGATCAA 1200
GAGACAGGAT GAGGATCGTT TCGCATGATT GAACAAGATG GATTGCACGC AGGTTCTCCG 1260
GCCGCTTGGG TGGAGAGGCT ATTCGGCTAT GACTGGGCAC AACAGACAAT CGGCTGCTCT 1320
GATGCCGCCG TGTTCCGGCT GTCAGCGCAG GGGCGCCCGG TTCTTTTTGT CAAGACCGAC 1380
CTGTCCGGTG CCCTGAATGA ACTGCAAGAC GAGGCAGCGC GGCTATCGTG GCTGGCCACG 1440
ACGGGCGTTC CTTGCGCAGC TGTGCTCGAC GTTGTCACTG AAGCGGGAAG GGACTGGCTG 1500
CTATTGGGCG AAGTGCCGGG GCAGGATCTC CTGTCATCTC ACCTTGCTCC TGCCGAGAAA
GTATCCATCA TGGCTGATGC AATGCGGCGG CTGCATACGC TTGATCCGGC TACCTGCCCA 
TTCGACCACC AAGCGAAACA TCGCATCGAG CGAGCACGTA CTCGGATGGA AGCCGGTCTT 
GTCGATCAGG ATGATCTGGA CGAAGAGCAT CAGGGGCTCG CGCCAGCCGA ACTGTTCGCC 
AGGCTCAAGG CGAGCATGCC CGACGGCGAG GATCTCGTCG TGACCCATGG CGATGCCTGC 
TTGCCGAATA TCATGGTGGA AAATGGCCGC TTTTCTGGAT TCATCGACTG TGGCCGGCTG 
GGTGTGGCGG ACCGCTATCA GGACATAGCG TTGGCTACCC GTGATATTGC TGAAGAGCTT 
GGCGGCGAAT GGGCTGACCG CTTCCTCGTG CTTTACGGTA TCGCCGCTCC CGATTCGCAG 
CGCATCGCCT TCTATCGCCT TCTTGACGAG TTCTTCTGAG CGGGACTCTG GGGTTCGAAA 
TGACCGACCA AGCGACGCCC AACCTGCCAT CACGAGATTT CGATTCCACC GCCGCCTTCT 
ATGAAAGGTT GGGCTTCGGA ATCGTTTTCC GGGACGCCGG CTGGATGATC CTCCAGCGCG 
GGGATCTCAT GCTGGAGTTC TTCGCCCACC CTAGGGGGAG GCTAACTGAA ACACGGAAGG 
AGACAATACC GGAAGGAACC CGCGCTATGA CGGCAATAAA AAGACAGAAT AAAACGCACG 
GTGTTGGGTC GTTTGTTCAT AAACGCGGGG TTCGGTCCCA GGGCTGGCAC TCTGTCGATA 
CCCCACCGAG ACCCCATTGG GGCCAATACG CCCGCGTTTC TTCCTTTTCC CCACCCCACC 
CCCCAAGTTC GGGTGAAGGC CCAGGGCTCG CAGCCAACGT CGGGGCGGCA GGCCCTGCCA 
TAGCCTCAGG TTACTCATAT ATACTTTAGA TTGATTTAAA ACTTCATTTT TAATTTAAAA 
GGATCTAGGT GAAGATCCTT TTTGATAATC TCATGACCAA AATCCCTTAA CGTGAGTTTT 
CGTTCCACTG AGCGTCAGAC CCCGTAGAAA AGATCAAAGG ATCTTCTTGA GATCCTTTTT
TTCTGCGCGT AATCTGCTGC TTGCAAACAA AAAAACCACC GCTACCAGCG GTGGTTTGTT 
TGCCGGATCA AGAGCTACCA ACTCTTTTTC CGAAGGTAAC TGGCTTCAGC AGAGCGCAGA 
TACCAAATAC TGTCCTTCTA GTGTAGCCGT AGTTAGGCCA CCACTTCAAG AACTCTGTAG 
CACCGCCTAC ATACCTCGCT CTGCTAATCC TGTTACCAGT GGCTGCTGCC AGTGGCGATA 
AGTCGTGTCT TACCGGGTTG GACTCAAGAC GATAGTTACC GGATAAGGCG CAGCGGTCGG 
GCTGAACGGG GGGTTCGTGC ACACAGCCCA GCTTGGAGCG AACGACCTAC ACCGAACTGA 
GATACCTACA GCGTGAGCTA TGAGAAAGCG CCACGCTTCC CGAAGGGAGA AAGGCGGACA 
GGTATCCGGT AAGCGGCAGG GTCGGAACAG GAGAGCGCAC GAGGGAGCTT CCAGGGGGAA 
ACGCCTGGTA TCTTTATAGT CCTGTCGGGT TTCGCCACCT CTGACTTGAG CGTCGATTTT 
TGTGATGCTC GTCAGGGGGG CGGAGCCTAT GGAAAAACGC CAGCAACGCG GCCTTTTTAC 
GGTTCCTGGC CTTTTGCTGG CCTTTTGCTC ACATGTTCTT TCCTGCGTTA TCCCCTGATT 
CTGTGGATAA CCGTATTACC GCCATGCATT AGTTATTAAT AGTAATCAAT TACGGGGTCA 
TTAGTTCATA GCCCATATAT GGAGTTCCGC GTTACATAAC TTACGGTAAA TGGCCCGCCT 
GGCTGACCGC CCAACGACCC CCGCCCATTG ACGTCAATAA TGACGTATGT TCCCATAGTA 
ACGCCAATAG GGACTTTCCA TTGACGTCAA TGGGTGGAGT ATTTACGGTA AACTGCCCAC 
TTGGCAGTAC ATCAAGTGTA TCATATGCCA AGTACGCCCC CTATTGACGT CAATGACGGT 
AAATGGCCCG CCTGGCATTA TGCCCAGTAC ATGACCTTAT GGGACTTTCC TACTTGGCAG 
TACATCTACG TATTAGTCAT CGCTATTACC ATGGTGATGC GGTTTTGGCA GTACATCAAT 
GGGCGTGGAT AGCGGTTTGA CTCACGGGGA TTTCCAAGTC TCCACCCCAT TGACGTCAAT 
GGGAGTTTGT TTTGGCACCA AAATCAACGG GACTTTCCAA AATGTCGTAA CAACTCCGCC 
CCATTGACGC AAATGGGCGG TAGGCGTGTA CGGTGGGAGG TCTATATAAG CAGAGCTGGT 
TTAGTGAACC GTCAGATCCG CTAGCGCTAC CGGTCGCCAC CATGGTGAGC AAGGGCGAGG 
AGCTGTTCAC CGGGGTGGTG CCCATCCTGG TCGAGCTGGA CGGCGACGTA AACGGCCACA 
AGTTCAGCGT GTCCGGCGAG GGCGAGGGCG ATGCCACCTA CGGCAAGCTG ACCCTGAAGT
TCATCTGCAC CACCGGCAAG CTGCCCGTGC CCTGGCCCAC CCTCGTGACC ACCCTGACCT 
ACGGCGTGCA GTGCTTCAGC CGCTACCCCG ACCACATGAA GCAGCACGAC TTCTTCAAGT 
CCGCCATGCC CGAAGGCTAC GTCCAGGAGC GCACCATCTT CTTCAAGGAC GACGGCAACT 
ACAAGACCCG CGCCGAGGTG AAGTTCGAGG GCGACACCCT GGTGAACCGC ATCGAGCTGA 
AGGGCATCGA CTTCAAGGAG GACGGCAACA TCCTGGGGCA CAAGCTGGAG TACAACTACA 
ACAGCCACAA CGTCTATATC ATGGCCGACA AGCAGAAGAA CGGCATCAAG GTGAACTTCA
AGATCCGCCA CAACATCGAG GACGGCAGCG TGCAGCTCGC CGACCACTAC CAGCAGAACA 
CCCCCATCGG CGACGGCCCC GTGCTGCTGC CCGACAACCA CTACCTGAGC ACCCAGTCCG 
CCCTGAGCAA AGACCCCAAC GAGAAGCGCG ATCACATGGT CCTGCTGGAG TTCGTGACCG 
CCGCCGGGAT CACTCTCGGC ATGGACGAGC TGTACAAGTC CGGACTCAGA TCTCGAGCTC 
AAGCTTCGAA TTCTGCAGTC GACGGTACCG CGGGCCCGGG ATCCTTCCGA GACCCCACGG 
ACGATGTTCA CGGCTCAGTG CTGTCCCTGG CCTCCAGTGC CTCCTCCACC TACTCCTCAG 
CTGAGGAGAG GATGCAATCT GAGCAAATCC GGAAGCTTCG TAGGGAACTG GAATCATCCC
AGGAAAAAGT GGCCACCTTG ACGTCTCAGC TTTCTGCCAA TGCTAATCTG GTGGCTGCTT
TTGAGCAGAG CCTGGTGAAT ATGACATCCC GCCTGCGACA CCTGGCAGAG ACGGCCGAGG
AGAAGGACAC TGAGCTGCTG GATTTGCGAG AAACCATAGA CTTTCTGAAG AAAAAGAACT
CTGAGGCCCA GGCAGTCATT CAGGGAGCCC TTAATGCCTC AGAAACCACA CCCAAAGAAC
TTCGGATCAA GAGACAAAAC TCCTCAGATA GCATCTCAAG CCTCAACAGC ATCACTAGCC
ATTCCAGCAT CGGCAGCAGC AAGGATGCTG ATGCGAAAAA GAAGAAAAAA AAGAGTTGGG
TCTATGAGCT TCGAAGTTCC TTCAACAAAG CGTTCAGTAT AAAAAAGGGG CCCAAGTCAG
CTTCCTCATA CTCGGATATA GAGGAGATTG CTACACCCGA CTCTTCAGCC CCCTCATCCC
CCAAACTACA GCATGGTTCC ACAGAGACTG CTTCACCCTC CATCAAGTCC TCCACCTTGT
CCTCCGTGGG CACTGATGTC ACCGAGGGCC CTGCTCACCC AGCCCCCCAC ACTAGGCTGT
TCCATGCAAA TGAGGAGGAG GAGCCAGAGA AGAAGGAGGT ATCGGAGCTG CGCTCTGAGC
TATGGGAGAA GGAAATGAAG CTTACAGACA TCCGCTTGGA GGCCCTCAAC TCTGCCCACC 
AACTGGATCA GCTTCGGGAG ACCATGCACA ACATGCAGTT GGAGGTGGAC CTGCTGAAAG 
CAGAGAATGA CCGACTGAAG GTAGCCCCAG GCCCCTCATC AGGCTCCACT CCAGGGCAGG 
TCCCTGGATC ATCTGCATTA TCTTCCCCAC GCCGCTCCCT AGGCCTGGCA CTCACCCATT 
CCTTCGGCCC CAGTCTTGCA GACACAGACC TGTCACCCAT GGATGGCATC AGT AC TTGTG 
GTCCAAAGGA GGAAGTGACC CTCCGGGTGG TGGTGAGGAT GCCCCCGCAG CACATCATCA
AAGGGGACTT GAAGCAGCAG GAATTCTTCC TGGGCTGTAG CAAGGTCAGT GGAAAAGTTG
ACTGGAAGAT GCTGGATGAA GCTGTTTTCC AAGTGTTCAA GGACTATATT TCTAAAATGG
ACCCAGCCTC TACCCTGGGA CTAAGCACTG AGTCCATCCA TGGCTACAGC ATCAGCCACG
TGAAACGAGT GTTGGATGCA GAGCCCCCCG AGATGCCTCC TTGCCGTCGA GGTGTCAATA
ACATATCAGT CTCCCTCAAA GGTCTGAAGG AGAAATGCGT CGACAGCCTG GTGTTCGAGA
CGCTGATCCC CAAGCCGATG ATGCAGCACT ACATAAGCCT CCTGCTGAAG CACCGGCGCC
TCGTCCTCTC GGGCCCCAGC GGCACGGGCA AGACCTACCT GACCAATCGC TTGGCCGAGT
ACCTGGTGGA GCGCTCTGGC CGTGAGGTCA CAGAGGGCAT CGTCAGCACC TTCAACATGC
ACCAGCAGTC TTGCAAGGAT CTGCAACTGT ATCTTTCCAA CCTAGCCAAC CAGATAGACC
GGGAAACAGG AATTGGGGAT GTGCCCCTGG TGATTCTATT GGATGACCTG AGTGAAGCAG
GCTCCATCAG TGAGTTGGTC AATGGGGCCC TCACCTGCAA GTATCATAAA TGTCCCTATA
TTATAGGTAC CACCAATCAG CCTGTAAAAA TGACACCCAA CCATGGCTTG CACTTGAGCT
TCAGGATGTT GACCTTCTCC AACAACGTGG AGCCAGCCAA TGGCTTCCTG GTTCGTTACC
TGAGGAGGAA GCTGGTAGAG TCAGACAGCG ACATCAATGC CAACAAGGAA GAGCTGCTTC
GGGTGCTCGA CTGGGTACCC AAGCTGTGGT ATCATCTCCA CACCTTCCTT GAGAAGCACA
GCACCTCAGA CTTCCTCATC GGCCCTTGCT TCTTTCTGTC GTGTCCCATT GGCATTGAGG
ACTTCCGGAC CTGGTTCATT GACCTGTGGA ACAACTCTAT CATTCCCTAT CTACAGGAAG
GAGCCAAGGA TGGGATAAAG GTCCATGGAC AGAAAGCTGC TTGGGAGGAC CCAGTGGAAT
GGGTCCGGGA CACACTTCCC TGGCCATCAG CCCAACAAGA CCAATCAAAG CTGTACCACC
TGCCCCCACC CACCGTGGGC CCTCACAGCA TTGCCTCACC TCCCGAGGAT AGGACAGTCA
AAGACAGCAC CCCAAGTTCT CTGGACTCAG ATCCTCTGAT GGCCATGCTG CTGAAACTTC
AAGAAGCTGC CAACTACATT GAGTCTCCAG ATCGAGAAAC CATCCTGGAC CCCAACCTTC
AGGCAACACT TTAAGGGTTC GGCAATCACT GTCACCCCCG GACAGCAGAA CGCTGGCATC
AGCTATCTTA GCTCCTCCTC TCCCCTCTCC TCTTTCAGAG CACTGGCTCT CCAGCCCCAG
GAGGAGAACA GGAGGGAGGA GGAGATGAAA GAGGAGGGAC AGGTTCTTGG TGCTGTACCT
TTGAGAACTT CCTAGGAAGG AATGGTGGGG TGGCGTTTGG GAACTTGTGC CCCCTAAACA
CATTTACTGG CCTCCTCTAA TGACTTTGGG GAAAAGATGA TTCTGGGTCT TTCCCTTGAC
TTCTTGTTTC AATTACAAAC TCCTGGGCTT TCTGGGGAGG GGTTCAGAAA ACATCAAAAC
ACTGCAGCAG TTCCCCGGAA TTCAGCTTGG ACTTAACCAG GCTGAACTTG CTCAAAAGAA
GCCGAATTCC AGCACACTGG CGGCCGTTAC TAGTT

MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTLTYGVQC
FSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKL 
EYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQL 

EKRDHMVLLEFVTAAGITLGMDELYKSGLRSRAQASNSAVDGTAGPGSFRDPTDDVHGSVLSLASSASSTY 

SSAEERMQSEQIRKLRRELESSQEKVATLTSQLSANANLVAAFEQSLVNMTSRLRHLAETAEEKDTELLDL 

RETIDFLKKKNSEAQAVIQGALNASETTPKELRIKRQNSSDSISSLNSITSHSSIGSSKDADAKKKKKKSW 

VYELRSSFNKAFSIKKGPKSASSYSDIEEIATPDSSAPSSPKLQHGSTETASPSIKSSTLSSVGTDVTEGP 

AHPAPHTRLFHANEEEEPEKKEVSELRSELWEKEMKLTDIRLEALNSAHQLDQLRETMHNMQLEVDLLKAE 

NDRLKVAPGPSSGSTPGQVPGSSALSSPRRSLGLALTHSFGPSLADTDLSPMDGISTCGPKEEVTLRVVVR 

MPPQHIIKGDLKQQEFFLGCSKVSGKVDWKMLDEAVFQVFKDYISKMDPASTLGLSTESIHGYSISHVKRV 

LDAEPPEMPPCRRGVNNISVSLKGLKEKCVDSLVFETLIPKPMMQHYISLLLKHRRLVLSGPSGTGKTYLT 

NRLAEYLVERSGREVTEGIVSTFNMHQQSCKDLQLYLSNLANQIDRETGIGDVPLVILLDDLSEAGSISEL 

VNGALTCKYHKCPYIIGTTNQPVKMTPNHGLHLSFRMLTFSNNVEPANGFLVRYLRRKLVESDSDINANKE

ELLRVLDWVPKLWYHLHTFLEKHSTSDFLIGPCFFLSCPIGIEDFRTWFIDLWNNSIIPYLQEGAKDGIKV 

HGQKAAWEDPVEWVRDTLPWPSAQQDQSKLYHLPPPTVGPHSIASPPEDRTVKDSTPSSLDSDPLMAMLLK 

LQEAANYIESPDRETILDPNLQATL

Figure 16: EST clone yk480b6 contains a splice variant of Ce-UNC-53



Similarity scale: residue positions 0 to 1600

Results of SIM with:

Sequence 1: Ce-UNC-53 (1583 residues); Sequence 2: yk480b06rc (1556 residues)

Ce-UNC-53 110 LSTYKQKI^QLKKDQKKLEQLPTSIMPPAVSKLPSPRVATSATASATNPNSNFPQMSTSR

yk480b06rc 5 ifQETO^RQLKKDQKKLEQLPTSIMPPAV^

* •** ******************************** ******************* 

Ce-UNC-53 170 LQTPQSRISKIDSSKIGIKPKTSGLKPPSSSTTSSNNTNSFRPSSRSSGNNNVGSTISTS 

yk480b06rc 65 LQTPQSRISKIDSSKIGIKPKTSGLKPPSSSTTSSNNTNSFRPSSRSSGNNNVGSTISTS

************************************************************ 
Ce-UNC-53 230 AKSLESSSTYSSISNLNRPTSQLQKPSRPQTQLVRVATTTKIGSSKLAAPKAVSTPKLAS
yk480b06rc 125 AKSLESSSTYSSISNLNRPTSQLQKPSRPQTQLVRVATTTKIGSSKLAAPKAVSTPKLAS

************************************************************ 
Ce-UNC-53 290 VKTIGAKQEPDNSGGGGGGMLKLKLFSSKNPSSSSNSPQPTRKAAAVPQQQTLSKIAAPV 
yk480b06rc 185 VKTIGAKQEPDNSGGGGGGMLKLKLFSSKNPSSSSNSPQPTRKAAAVPQQQTLSKIAAPV 

************************************************************ 
Ce-UNC-53 350 KSGLKPPTSKLGSATSMSKLCTPKVSYRKTDAPIISQQDSKRCSKSSEEESGYAGFNSTS
yk480b06rc 245 KSGLKPPTSKLGSATSMSKLCTPKVSYRKTDAPIISQQDSKRCSKSSEEESGYAGFNSTS 

************************************************************ 
Ce-UNC-53 410 PTSSSTEGSLSMHSTSSKSSTSDEKSPSSDDLTLNASIVTAIRQPIAATPVSPNIINKPV 

yk480b06rc 305 PTSSSTEGSLSMHSTSSKSSTSDEKSPSSDDLTLNASIVTAIRQPIAATPVSPNIINKPV 

************************************************************ 
Ce-UNC-53 470 EEKPTIAVKGVKSTAKKDPPPAVPPRDTQPTIGWSPIMAHKKLTNDPVISEKPEPEKLQ 
yk480b06rc 365 EEKPTLAVKGVKSTAKKDPPPAVPPRDTQPTIGWSPIMAHKKLTNDPVISEKPEPEKLQ
************************************************************

Ce-UNC-53 530 SMSIDTTDVPPLPPLKSWPLKMTSIRQPPTYDVLLKQGKITSPVKSFGYEQSSASEDSI

yk480b06rc 425 SMSIDTTDVPPLPPLKSWPLKMTSIRQPPTYDVLLKQGKITSPVKSFGYEQSSASEDSI
************************************************************ 

Ce-UNC-53 590 VAHASAQVTPPTKTSGNHSLERRMGKNKTSESSGYTSDAGVAMCAKMREKLKEYDDMTRR

yk480b06rc 485 VAHASAQVTPPTKTSGNHSLERRMGKNKTSESSGYTSDAGVAMCAKMREKLKEYDDMTRR
************************************************************ 

Ce-UNC-53 650 AQNGYPDNFEDSSSLSSGISDNNELDDISTDDLSGVDMATVASKHSDYSHFVRHPTSSSS

yk480b06rc 545 AQNGYPDNFEDSSSLSSGISDNNELDDISTDDLSGVDMATVASKHSDYSHFVRHPTSSSS
************************************************************ 

Ce-UNC-53 710 KPRVPSRSSTSVDSRSRAEQENVYKLLSQCRTSQRGAAATSTFGQHSLRSPGYSSYSPHL 

yk480b06rc 605 KPRVPSRSSTSVDSRSRAEQENVYKLLSQCRTSQRGAAATSTFGQHSLRSPGYSSYSPHL

Ce-UNC-53 770 SVSADKDTMSMHSQTSRRPSSQKPSYSGQFHSLDRKCHLQEFTSTEHRMAALLSPRRVPN 

yk480b06rc 665 SVSADKDTMSMHSQTSRRPSSQKPSYSGQFHSLDRKCHLQEFTSTEHRMAALLSPRRVPN
Ce-UNC-53 830 SMSKYDSS--

yk480b06rc 725 SMSKYDS^AAALNASGMSRSMILLESLS?RPPRRHQSPADSCIITASPSAPRRSHSPRGP|
******** 

Ce-UNC-53 838 GSYSARSRGGSSTGIYGETFQLHRLSDEKSPAHSAKSEMGS 

yk480b06rc 785 |TARIPLSLASSPVHVNNNVte SYSARSRGGSSTGIYGETFQLHRLSDEKSPAHSAKSEMGS

** *************************************** 

Ce-UNC-53 879 QLSLASTTAYGSLNEKYEHAIRDMARDLECYKNTVDSLTKKQENYGALFDLFEQKLRKLT

yk480b06rc 845 QLSIASTTAYGSLNEKYEHAIRDMARDLECYKNTVDS M

************************************************************ 

Ce-UNC-53 939 QHIDRSNLKPEEAIRFRQDIAHLRDISNHLASNSAHANEGAGELLRQPSLESVASHRSSM 

yk480b06rc 905 QHIDRSNLKPEEAIRFRQDIAHLRDISNHLASNSAHANEGAGELLRQPSLESVASHRSSM

Ce-UNC-53 999 SSSSKSSKQEKISLSSFGKNKKS W IRSSLSKFTKKKNKNYDEAHMPSISGSQG

yk480b06rc 965 SSSSKSSKQEKISLSSFGKNKKSV jALSVDSij lRSSLSKFTKKKNKNYDEAHMPSISGSQG
************************ ***************************** 

Ce-UNC-53 1052 TLDNIDVIELKQEIJfCERDSALYKTOLDNLJDRARE 

yk480b06rc 1025 TLDNIDVIELKQELKERDSALYEVRI^NLDRAREVDVLRETVNKLKTENKQLKKEV^

************************************************************ 

Ce-UNC-53 1112 NGPATRASSRASIPVIYDDEHVYDAACSSTSASQSSKRSSGCNSIKVTVNVDIAGEISSI 
yk480b06rc 1085 NGPATRASSRASIPVIYDDEHVYDAACSSTSASQSSKRSSGCNSIKVTVNVDIAGEISSI

Ce-UNC-53 1172 VNPDKEIIVGYLAMSTSQSCWKDIDVSILGLFEVYLSRIDVEHQLGIDARDSILGYQIGE 
yk480b06rc 1145 VNPDKEIIVGYLAI^TSQSCWKDIDVSILGLFEVYLSRIDVEHQLGIDARDSILGYQIGE 
************** ********************************************* 

Ce-UNC-53 1232 LRRVIGDSTTMITSHPTDILTSSTTIRMFMHGAAQSRVDSLVLDMLLPKQMILQLVKSIL
yk480b06rc 1205 LRRVIGDSTTMITSHPTDILTSSTTIRMFMHGAAQSRVDSLVLDMUJPKQMILQLVKSIL 

Ce-UNC-53 1292 TERRLVIAGATGIGKSKIJUCTUSAYVSIRTNQSEDSIVNISIPENNKEELIiQVERRLEKI 
yk480b06rc 1265 TERRLVLAGATGIGKSKLAKTLAAYVSIRTNQSEDSIVNISIPENNKEELLQVERKLEKI 

Ce-UNC-53 1352 LRSKESCIVILDNIPKNRIAFWSWANVPLQNNEGPFVVCTVNRYQIPELQIHHNFKMS
yk480b06rc 1325 LRSKESCIVILDNIPKNRIAFWSVFANVPLQNNEGPFVVCTVNRYQIPELQIHHNFK^

Ce-UNC-53 1412 VMSNRLEGFILRYLRRRAVEDEYRLTVQMPSELFKIIDFFPIALQAVNNFIEKTNSVDVT
yk480b06rc 1385 VMSNRLEGFILRYLRRRAVEDEYRLTVQMPSELFKIIDFFPIALQAVNNFIEKTNSVDVT

Ce-UNC-53 1472 VGPRACLNCPLTVDGSREWFIRLWNENFIPYLERVARDGKKTFGRCTSFEDPTDIVSKKW
yk480b06rc 1445 VGPRACLNCPLTVDGSREWFIRLWNENFIPYLERVARDGKKTFGRCTSFEDPTDIVSgKW

Ce-UNC-53 1532 PWFDGENPENVLKRLQLQDLVPSPANSSRQHFNPLESLIQLHATKHQTIDNI
yk480b06rc 1505 PWFDGENPENVLKRLQLQDLVPSPANSSRQHFNPLESLIQLHATKHQTIDNI 

Legend: the alternative splices and the mutation (S-P) are 
indicated in red and are boxed. 